Digital intuition


A computer program that can outplay humans in the abstract game of Go will redefine our
relationship with machines.


Nees had it and so did Charles Darwin. Tennis champion Roger Federer has
it in spades. The dictionary defines intuition as knowledge obtained
without conscious reasoning. It is decision-making based on apparently
instinctual responses; thinking without thinking.

Intuition is a very human skill, or so we like to think. Or, more
accurately, so we liked to think. In what could prove to be a landmark
moment for artificial intelligence, scientists announce this week that
they have created an intuitive computer. The machine acts according
to its programming, but it also chooses what to do on the basis of
something — knowledge, experience or a combination of the two — that
its programmers cannot predict or fully explain. And, in the limited
tests carried out so far, the computer has proved that it can make these
intuitive decisions much more effectively than the most skilled humans
can. The machines are not just on the rise, they have nudged ahead.

Experts in ethics, computer science and artificial intelligence
routinely debate whether clever machines in the future will use their
powers for good or evil. This latest example of digital discovery puts
neural networks to work on a problem that is almost as old: how to
win at the board game Go.

Outside business-management seminars, Go is not well known in
the West, but it is older, more complex and harder to master than chess.
Yet it is simpler to learn and play: two players take it in turns to place
black or white counters on a grid. When a counter (called a stone) is
surrounded by rivals, it is removed from the board. Winning — like
so much in life and war — is about controlling the most territory. The
game is wildly popular across countries in east Asia, and players from
Japan, China and South Korea routinely compete in televised
professional tournaments.

Computers mastered chess two decades ago, when IBM’s Deep
Blue machine won against then-world-champion Garry Kasparov in
1997, but Go was thought to be safe from artificial conquest. That is
partly because all of the possible moves in Go, as well as the resulting
combinations of stones on the board, are much too numerous for any
computer to crunch through and compare to select one manoeuvre.
(The same goes for chess, but the diversity in the value of chess pieces
enables some short cuts.) In Go, all stones are worth the same and
their influences can be felt through vast distances across the board.

On page 484 of this issue, computer scientists at Google DeepMind
in London unveil the successor to Deep Blue. It is a program called
AlphaGo, and in October 2015 it beat the human Go champion of
Europe by five games to zero. To put that into context, in Deep Blue's
time, a human beginner with just a week's practice could easily defeat the
best Go computer programs. A match between AlphaGo and the world’s
most titled player of the decade is lined up for March (see page 445).

AlphaGo cannot explain how it chooses its moves, but its
programmers are more open than Deep Blue’s in publishing how it is
built. Previous Go computer programs explore moves at random,
but the new technology relies on a suite of deep neural networks.
These were trained to mimic the moves of the best human players,
to reward wins and, using a probability distribution, to limit the
outcomes for any board position to a single verdict: win or lose.
Working together, these machine-learning strategies can massively
reduce the number of possible moves the program evaluates and
chooses from — in a seemingly intuitive way.
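
The pruning idea in the paragraph above can be sketched in a few lines of
Python. This is a toy illustration only, not DeepMind's code: the function
name choose_move, the move labels and every number are hypothetical, and
plain dictionaries stand in for the neural networks. One set of scores ranks
how plausible each move looks, a shortlist is kept, and a second estimate of
the chance of winning picks from that shortlist.

```python
from typing import Callable, Dict, Hashable, Optional

Move = Hashable


def choose_move(
    move_priors: Dict[Move, float],
    win_estimate: Callable[[Move], float],
    keep: int = 3,
) -> Optional[Move]:
    """Prune candidate moves by prior plausibility, then rank by win estimate."""
    if not move_priors:
        return None
    # Step 1: keep only the few moves that look most plausible.
    shortlist = sorted(move_priors, key=move_priors.get, reverse=True)[:keep]
    # Step 2: among the shortlist, pick the move with the best win estimate.
    return max(shortlist, key=win_estimate)


# Hypothetical numbers, for illustration only.
priors = {"A": 0.50, "B": 0.30, "C": 0.15, "D": 0.04, "E": 0.01}
values = {"A": 0.48, "B": 0.61, "C": 0.55, "D": 0.90, "E": 0.20}

best = choose_move(priors, values.get, keep=3)
print(best)
```

In this toy run, move D has the highest win estimate but is never examined,
because the first step has already discarded it. That is the trade-off such
an approach accepts: far fewer positions to evaluate, at the cost of trusting
the shortlist.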

As shown by its results, the moves that AlphaGo selects are invariably
correct. But the interplay of its neural networks means that a human can
hardly check its working, or verify its decisions before they are followed
through. As the use of deep neural network systems spreads into everyday
life — they are already used to analyse and recommend
financial transactions — it raises an interesting concept for humans
and their relationships with machines. The machine becomes an
oracle; its pronouncements have to be believed.

When a conventional computer tells an engineer to place a rivet or a
weld in a specific place on an aircraft wing, the engineer — if he or she
wishes — can lift the machine's lid and examine the assumptions and
calculations inside. That is why the rest of us are happy to fly. Intuitive
machines will need more than trust: they will demand faith.




In praise of parks 


Our affection for national parks is well 
founded, but many more areas need protection. 


Yellowstone, the world’s first national park, was created in 1872.
It took rather longer for politicians to set up an agency to
actually oversee such places: they got around to that in 1916.
So the US National Park Service celebrates its centenary this year.

The agency also marked a shift in the way people think about parks.
Yellowstone, which lies mostly in Wyoming, has little in common with
the manicured gardens enjoyed by European gentry or admired by
ancient Chinese kings. It and other huge, wild national parks are places
where nature can supposedly be seen unmodified and unadorned, far
from the pollution and bustle of cities.

Like much contemporary thinking, this rather ignores the history
of native peoples and their stewardship of swathes of land before the
arrival of Europeans. But this relatively new idea of parks as a wild
refuge from the modern world has taken root. The United States’
national parks have become some of the most iconic places in the
country. Yellowstone and Glacier National Park in Montana rival the
White House and the Smithsonian as tourist attractions.

Similarly beloved national parks exist in other countries. The United 
Kingdom protected the Peak District in 1951 and now has 15 national 
parks. China started to protect nature reserves in the 1950s and now 
has jewels such as the Zhangjiajie National Forest Park and Jiuzhaigou 
nature reserve. In 2007, Pudacuo National Park became what is sometimes
claimed to be the country’s first true ‘national park’ (as it reaches
standards laid down by the International Union for Conservation 
of Nature). 

It has even been suggested that cities themselves can be parks, rather 
than just containing them. A campaign has been launched to have 
London declared a kind of urban national park. This might seem a 
backwards device — in general, parks are established in beautiful 
places that people love, not established to make places beautiful and 
encourage people to love them. But it goes to show the affection that 
many feel towards places classified as parks, be they vast national 
expanses or local patches of scrubby grass. 

This affection is not based solely on a misty-eyed yearning for the 
outdoors. There is an evidence base that parks are a good thing. Many 
studies have confirmed that they come with significant benefits. They 
seem to make people who use them healthier and happier. They make 
local ecosystems more diverse and more resilient. They can even help 
to mitigate climate change to a small degree. 

But not everyone is happy when land is set aside in parks and other 
uses are limited. In the United States, a group of armed men have 
seized — and, as Nature went to press, were still in control of — the 
Malheur National Wildlife Refuge in Oregon. Although there are a 
plethora of issues related to that act of insurgency, this event is linked 
to a dispute over attempts by the federal government to control cattle 
grazing so as to protect a species of tortoise. 

This situation might be extreme. But the story of conflict between
park authorities and people who may once have worked inside park
boundaries, or who wish to work there, is universal. Last week,
60 non-governmental organizations again raised the issue of threats to
the Virunga National Park in the Democratic Republic of the Congo,
one of the last remaining strongholds for mountain gorillas. The
prospect of drilling for oil in the park itself has been of concern in the
past, and environmental groups are now warning that oil drilling in
nearby Uganda could harm the ecosystem of which the park forms a part.

Things have been equally fraught at sea. As governments have created
more and more ‘marine protected areas’, fishermen have railed against
being excluded from waters they once hauled nets in. Researchers have
questioned whether many of these areas are actually protecting what
needs to be safeguarded. And there are questions about just how
protected some of these areas are, and whether countries are gaming
systems to hit international targets.

The spirit of international targets to protect 17% of terrestrial areas 
and 10% of marine areas certainly intends that they be reached by
protecting places that warrant support, not those that are easy to protect
because no one cares about exploiting what is there. 

Paradoxically, as it becomes ever more apparent that we need to 
protect areas of outstanding beauty and delicate ecology, it is
becoming increasingly clear that it is not enough to do only this. Setting aside
an area as a park should not be used as a fig leaf for a lack of a wider 
environmental approach. Cities, agricultural landscapes, wasteland 
and seas open to industry all need to be managed in a sensible and 
planned fashion. 

We need more parks. But the real challenge is to make people treat 
the whole planet with the respect that most show to their parks.


Found out 


Self-doubt is a pernicious affliction that can 
overwhelm researchers. 


Oh good grief, why did I ever say that I would write something
about imposter syndrome? What do I know about it, really?
I'm not a psychologist or a researcher or a proper expert, I'm
just a journalist. I thought I knew what imposter syndrome was — that
some people don't call it a syndrome as such, because that implies a
mental disorder. And I thought that I had suffered from those
feelings of doubt and inadequacy about my abilities, but now I’m not
sure. Maybe other people just suffer from imposter syndrome more
badly than I do.

What if I simply tell people to go and read the Careers feature on
page 555 that describes how imposter syndrome can affect people in
science, and which offers some useful tips on overcoming what, as it
turns out, are very common feelings? But then again, won't that make
it clear that I don’t have anything else to say?

Maybe I can deflect attention from my own pitiful performance
by citing talented celebrities who have admitted to sometimes
feeling like frauds and imposters. The multiple-Oscar-winning film star
Meryl Streep perhaps? I’m sure I read somewhere, though I might be
wrong, that she once said she couldn't understand why anyone would
want to watch her on screen because she felt she couldn't act. Or the
famous and award-gathering author Maya Angelou, who after each
of her eleven books, said she felt that this was the time she was going
to be found out.

See, I have done the research. I do know what I am talking about,
so why does it feel as if everyone around me is simply better at this
than me? I bet that’s the way the editor thinks, too. Maybe this would
be a good time to throw in an Einstein quote, and seek some reflected
glory: “The exaggerated esteem in which my lifework is held makes
me very ill at ease. I feel compelled to think of myself as an involuntary
swindler.”

I wish I had that Dunning-Kruger effect, the almost opposite 
experience to imposter syndrome in which people who really aren't 
qualified or knowledgeable show remarkable (and misplaced)
confidence in their abilities and decisions. Life would be so much easier
then, or at least it would seem that way. 

The thing about imposter syndrome is that it’s been known and 
written about since the late 1980s, and yet each generation of young 
scientists (and teachers, nurses, jet pilots and so on) feel isolated 
and anxious because of it. They feel that they are the only ones to 
have these crippling self-doubts, as if someone is about to tap them 
on the shoulder and confess that the whole situation — the job, the 
responsibility, the career — is an elaborate hoax and they should go 
home and stop being so presumptuous as to believe that they had 
anything to offer. 

They need to know that these thoughts and ideas are common, and 
in fact are most common among genuine high achievers. They should 
be told that rejection — of papers, grants, ideas — in science is the 
norm and that they shouldn't lose heart when it happens. After all, 
this is a field of human endeavour in which experts boast about how 
little they know and proudly display their margins of error. Young 
and vulnerable researchers need to know that if they tell someone — a 
friend or colleague or mentor — about how they 
are feeling, then they will almost certainly hear 
the words ‘me too’ and will feel better. 

I should tell them that. If only I could find the
right words.
At the beginning of this month, Prime Minister Narendra Modi
announced a road map to guide India’s science and technology
over the next two decades. Launched during the Indian Science
Congress at the University of Mysore, the plan signalled a cautious
approach to techniques such as genetically modified (GM) crops,
noting that “some aspects of biotechnology have posed serious legal and
ethical problems in recent years”. That is true, but a different and much
larger problem looms for India. According to the 2015 United Nations
World Population Prospects report, India will surpass China by early next
decade as the most populous country on Earth, with the most mouths to
feed. India is already classed as having a ‘serious’ hunger problem,
according to the 2015 Global Hunger Index of the International Food Policy
Research Institute. There is a danger that many of these new Indians
will not have sufficient food.

Where can additional food come from? Grain 
production is stagnant, and rapid urbanization is 
reducing available land. To increase food production,
India needs to invest in modern agricultural
methods, including GM crops. 

Indian researchers have shown that they have the expertise to generate
GM plants, most obviously the pest-resistant cotton that is now widely
grown in India. But almost all of this work has relied on
molecular-biology research done elsewhere — India has in effect borrowed
or been given the genes. This leads to complications, usually conflict
over intellectual property (IP) rights.

Most high-profile was the insecticide-producing GM cotton variety that
was released by the Indian Council of Agricultural Research in New
Delhi in 2009. It was based on a Bacillus thuringiensis gene to which
the agricultural biotechnology company Monsanto, based in St Louis,
Missouri, owned the IP rights. The ensuing controversy
has seeded confusion among Indian researchers, scientific managers
and administrators over IP rights, patents and the related rules and
regulations.

In response, India is turning to research based on old discoveries, 
including genes that are in the public domain or no longer protected 
by patents. The problem here is that insects have already developed 
resistance to the toxins produced by such genes: the companies that 
developed first-generation GM crops with these genes are already 
on second- and even third-generation versions of the same plants. 
Increased use of this old technology in India can only accelerate resistance
and make the situation more difficult. Other developing countries
(including Pakistan) are also turning to such
India needs home-grown
GM food to stop starvation


Indian scientists must develop domestic genetically modified crops rather than 
rely on unsuitable foreign technology, says Anurag Chaurasia. 


genes in indigenous organisms and crops (such as chickpea and rice). 
Indian microbial institutes should take up projects in this direction, 
because most of the currently used genes for transgenic generation 
are of microbial origin. That requires a change in direction from an 
Indian GM-food strategy that has traditionally aimed at quick product 
development instead of careful assessment of the underlying science. 
Such home-grown GM crops would also reduce reliance on transgenic
technology produced by multinational companies, which is
expensive and rarely optimized for the conditions of specific regions.
Some GM crops designed abroad need more water than is usually available
in some parts of India, for example, putting great stress on farmers.
Indian scientists need better training in IP issues, especially when our 
researchers join foreign collaborations to examine 
and exploit the molecular biology of our natural 
resources. Otherwise, Indian researchers may get 
the scientific credit for discoveries but fail to claim 
the right to commercialize the products developed. 
Indian regulators should exert tighter controls
on IP rights. At present, they focus only on the
export of physical material, such as seeds and tissue.
They need also to monitor, and make claims
on, molecular information drawn from this
material, down to the level of genes and promoter
regions. According to the Food and Agriculture
Organization of the UN, India is the largest donor
of crop germplasm to the world. Without realizing
its importance, we are giving away the rights
to exploit one of our most precious assets.
Agrarian India is excelling in space science, 
but it needs to focus closer to home as well. It 
needs to follow the example of China, which is 
slowly but steadily building a GM-food market that is based on domestic
discoveries. Compared with China, India has three times as much
land planted with GM crops, but whereas India’s plants were mostly 
created with technology bought from abroad, China's fields contain 
crops that were developed, tested and commercialized by Chinese 
scientists. India does not have to reject the expertise of international 
companies, but it must do more to build knowledge and skills at home. 
Mahatma Gandhi only wore clothes that he had woven himself.
He gave India the slogan “from swadeshi to swaraj”, which means “be
indigenous in order to self-rule”. The Indian government should take
this message on board when planning future investment in biotechnology.
The theme of this month's science congress, after all, was “science
and technology for indigenous development in India”. Indigenous
development needs indigenous research.


Anurag Chaurasia is a biotechnologist with the National Bureau of 
Agriculturally Important Microorganisms in Kushmaur, India. 
e-mail: govtofindia.icar@gmail.com 
Selections from the
scientific literature


RESEARCH HIGHLIGHTS 


INFECTIOUS DISEASE 


Antibody for range 
of ebolaviruses 


Antibodies that recognize 
multiple ebolavirus species 
could treat the deadly 
infection. 

Humans infected with 
ebolaviruses make antibodies 
that bind to proteins on the 
surface of virus particles. 
This prevents the virus 
from infecting more cells, 
but it is unclear whether 
antibodies for one of the 
species of ebolavirus will 
work for the others. To find 
out, Alexander Bukreyev 
at the University of Texas 
Medical Branch in Galveston, 
James Crowe at Vanderbilt 
University in Nashville, 
Tennessee, and their team 
cultured antibody-producing 
B cells from seven people 
who survived a 2007 Ebola 
outbreak in Uganda, caused 
by an ebolavirus species 
called Bundibugyo. 

Some of the antibodies 
blocked two other species 
of ebolaviruses. One of the 
Bundibugyo antibodies 
protected guinea pigs that 
had been infected with 
another species of ebolavirus. 
Cell http://doi.org/bb3g (2016) 


Turbulence roils
luminous galaxy


The brightest-known galaxy is blasting gas out into space — and
providing astronomers with a rare glimpse of how extreme galaxies evolve.

Known as W2246-0526, the galaxy is as bright as 350 trillion Suns and is
powered by a supermassive black hole at its heart. A team led by Tanio
Diaz-Santos at Diego Portales University in Santiago, Chile, used the
high-resolution Atacama Large Millimeter/submillimeter Array in Chile
to study carbon ions rushing outwards from the galaxy.
The gas races out at speeds of about 2 million kilometres an hour,
violently illuminating the surrounding space. W2246-0526 might be
spewing out much of its energy, and could become more tame in the future.
Astrophys. J. Lett. 816, L6 (2016)


PLANT SCIENCE


Plants count to five


Venus flytraps count the number of touches from trapped
insect prey before producing digestive juices.

Erwin Neher at the Max Planck Institute for Biophysical
Chemistry in Göttingen and Rainer Hedrich of the University
of Würzburg, both in Germany, and their colleagues touched
the leaves of Venus flytraps (Dionaea muscipula; pictured)
to mimic captured, moving prey. They recorded the plant's
electrical impulses in response to 1-60 touches, and found
that two impulses triggered the trap to close. But only after
the fifth impulse did plants begin to synthesize the digestive
enzyme hydrolase and increase their production of a sodium
transporter, which is used to absorb nutrients.

Venus flytraps may record the touches of potential prey to
identify insects that are worth digesting, the authors say.


Curr. Biol. http://doi.org/bbzp (2016)
Shielded cells
treat diabetes 


Insulin-producing cells derived 
from human stem cells restore 
blood sugar to normal levels 
when encased in a porous 
biomaterial and implanted in 
diabetic mice. 

People with severe type 1 
diabetes can sometimes be 
treated with a transplant of 


insulin-producing cells from 
cadavers, but the cell supply 
is limited. The recipient must 
also stay on drugs to stop the 
immune system attacking the 
cells. Instead, Daniel Anderson 
of the Massachusetts Institute 
of Technology in Cambridge 
and his colleagues derived
insulin-producing β-cells
from human embryonic stem
cells and encapsulated them
in a substance called TMTD
alginate.

When implanted into 
diabetic mice, the coated cells 
were shielded from immune 
attack. The animals also 
maintained normal blood- 
sugar control until the implants 
were removed 174 days later. 
Nature Med. http://dx.doi.org/10.1038/nm.4030 (2016)


Rising seas differ 
by region 


The expansion of oceans as the 
climate warms has contributed 
to a rise in global sea levels 
of about 1.38 millimetres per 
year — roughly twice that of 
previous estimates. 

Roelof Rietbroek at 
the University of Bonn in 
Germany and his colleagues 
analysed sea-surface heights 
from satellite radar data, and 
looked at changes in water 
storage from the Gravity 
Recovery And Climate 
Experiment (GRACE). They 
found that between 2002 and 
2014, sea levels have increased 
in total by roughly 2.74 mm 
per year, with 1.38 mm of that 
coming from ocean thermal 
expansion and 1.08 mm from 
melting ice sheets, glaciers and 
other water sources on land, 
such as rivers. 
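
The budget quoted above can be checked with simple arithmetic. The short
Python sketch below uses only the three figures given in the text; the
variable names are illustrative, and the label for the remainder is an
assumption, because the summary does not say what makes up the rest of
the total.

```python
# Figures quoted in the text, in millimetres per year for 2002-2014.
total_rise = 2.74
thermal_expansion = 1.38
land_water = 1.08  # melting ice sheets, glaciers and other water on land

# Sum the two named contributions and see what is left over.
accounted = thermal_expansion + land_water
unattributed = total_rise - accounted  # assumption: not itemized in the text

# Express each named contribution as a share of the total rise.
thermal_share = thermal_expansion / total_rise
land_share = land_water / total_rise

print(f"named contributions: {accounted:.2f} mm per year")
print(f"left over:           {unattributed:.2f} mm per year")
print(f"thermal expansion:   {thermal_share:.0%} of the total")
print(f"land sources:        {land_share:.0%} of the total")
```

The two named sources add up to about 2.46 mm per year, which leaves
roughly 0.28 mm per year of the total unassigned in this summary.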

The team also uncovered 
large regional differences. 
For example, the Philippines 
experienced a sea-level rise 
of about 14.7 mm per year, 
mostly because of thermal 
expansion of the ocean,
whereas the central and
eastern Pacific saw decreases.
Proc. Natl Acad. Sci. USA http://dx.doi.org/10.1073/pnas.1519132113 (2016)


Beige fat boosts 
metabolism 


Human ‘beige’ fat cells
implanted in mice can 
improve the animals’ glucose 
metabolism and liver-fat 
profiles. 

The presence of beige 
fat — brown fat cells within 
white fat-storing tissue — 
is correlated with better 
metabolic health, but it was 
not known whether beige fat 
causes this. To see whether 
there is a causal link, Silvia 
Corvera of the University 
of Massachusetts Medical 
School in Worcester and her 
colleagues grew human beige 
fat cells in the lab, placed them 
in mice, and found that they 
formed well-defined adipose 
tissue. Animals with the 
implants had lower blood- 
glucose levels, absorbed the 
glucose more quickly than did 
untreated controls, and had 
less fat in their livers. 

The results suggest that beige 
fat could have therapeutic use, 
the authors say. 

Nature Med. http://dx.doi.org/10.1038/nm.4031 (2016)


MATERIALS 


Add water for 
3D-printed flowers 


Researchers have 3D-printed 
hydrogel composites that swell 
and morph into flower shapes 
when immersed in water. 
Lakshminarayanan Mahadevan and Jennifer
Lewis at Harvard University 
in Cambridge, Massachusetts, 
and their colleagues used an 
ink made of cellulose fibrils 
embedded in a hydrogel 
matrix, which mimics plant- 
cell walls and swells in water. 
By controlling the alignment 
of the fibrils in the ink during 
printing, the team produced 
flat materials that bend and 
twist when placed in water, 
producing structures that 
mimic flowers (pictured). 
The approach could be 
used to create designer, 
shape-changing structures 
for biomedical applications or 
smart textiles, the authors say. 
Nature Mater. http://dx.doi.org/10.1038/nmat4544 (2016)


Polymers woven 
into stretchy web 


Organic polymers woven 
into a 3D framework offer a 
new way of making flexible 
materials with tunable 
properties. 

Covalent organic 
frameworks are highly 
porous structures with many 
promising applications, but 
they are typically rigid. Omar 
Yaghi of the University of 
California, Berkeley, Osamu 
Terasaki of Stockholm 
University and their colleagues 
created such a framework, 
dubbed COF-505. It is made 
of individual building blocks 
of copper ions that carry 
fragments of a polymer. 
Joining these units together 
with linear molecules formed 
crystals with the same 
tetrahedral geometry as 
diamond. 

The researchers then removed the copper ions to leave interwoven,
helical polymer threads that were collectively ten times more elastic
than the precursor. The copper ions could also be replaced, raising the
possibility of loading the polymer weave with metal catalysts, or of
using it to absorb metal ions from liquid waste.

Science 351, 365-369 (2016)


ANIMAL BEHAVIOUR


Voles console
stressed friends


Prairie voles seem to console their distraught cage-mates — a behaviour
previously seen only in humans and in other animals with advanced
cognition, such as great apes and elephants.

James Burkett, Larry Young and their colleagues at Emory University in
Atlanta, Georgia, separated pairs of prairie voles (Microtus ochrogaster;
pictured) in the lab and measured how long the rodents groomed each other
when they were reunited. Voles spent significantly more time grooming
partners that had been subjected to noise and mild electric shocks
during the separation period, even though they had not observed the
stressful event. The unstressed voles showed the same levels of stress
hormones as their stressed cage-mates. This response disappeared when
the researchers chemically blocked the brain receptor for oxytocin, a
hormone involved in empathy in humans.
Further research on this consolation behaviour in rodents could yield
insight into certain psychiatric disorders that involve a lack of
empathy, the authors say.
Science 351, 375-378 (2016)


Popular topics
on social media


House bugs crawl over social media


Many commenters on Twitter this week felt their skin crawl
after reading that some US households are home to more
than 200 different species of insects and other creatures,
according to one study. Entomologists collected more than
10,000 specimens of arthropods (insects and other animals
with exoskeletons and segmented bodies) from 50 homes
in Raleigh, North Carolina, and found surprising diversity.
Their results, published in PeerJ, suggest that the average
home contained 93 different species, from spiders and flies to
cockroaches and beetles. Out of the 304 arthropod families
identified, 149 were rare. And only 5 out of the 554 rooms
examined — 4 bathrooms and 1 bedroom — contained
no bugs at all. Joachim Maes, an ecologist at the European
Commission's Institute for Environment and Sustainability
in Ispra, Italy, tweeted: “We are literally surrounded by
biodiversity.” The study analysed only the types of species
present, and the authors recommend a more-in-depth study
of confined spaces in homes — such as under the stairs — to get
more-accurate data on the number and diversity of household bugs.

PeerJ 4, e1582 (2016)
Hottest year


Last year was the hottest on record, according to data released on
20 January by NASA, the US National Oceanic and Atmospheric
Administration (NOAA) and the UK Met Office. All three organizations
document unprecedented high temperatures in 2015, pushing the global
average to more than 1 °C above pre-industrial levels. A powerful
El Niño weather system, marked by warmed waters in the tropical Pacific
Ocean, helped to drive atmospheric temperatures well above those for
2014, the previous record high. Some researchers suggest that broader
trends in the Pacific could spell even more dramatic temperature
increases in years to come. See page 450 for more.


Davos debates 
Health, science and 
environmental issues were high 
on the agenda at the World 
Economic Forum (WEF) 
meeting in Davos, Switzerland, 
where international leaders 
met on 20-23 January. The 
2016 meeting — which 

had as its theme ‘the fourth 
industrial revolution — saw 
85 pharmaceutical companies 
demand new business models 
to incentivize the development 
of antibiotics to combat 


paz07 26 -] 


The largest known prime 
number, at more than 

22 million digits. The find 
was announced this month 
by the Great Internet 
Mersenne Prime Search 
(www.mersenne.org). 


The news in brief 


DNA search reveals new frog genus 


DNA analyses have unveiled a new genus 

of frogs. The group, Frankixalus, hails from 
northeastern India and includes a taxonomically 
challenging amphibian, first described in 1876 
and named Polypedates jerdonii. The species, 
now renamed Frankixalus jerdonii (pictured), is 
joined in the new genus by one other — currently 


drug-resistant infections. 
Environmental discussions 
included a WEF report 
predicting that the oceans 
would contain more plastic 
than fish by weight by 2050, 
and an updated Environmental 
Protection Index report from 
Yale University in New Haven, 
Connecticut, which shows an 
increased degradation of air 
quality around the world. 


Turkish protest 

An international coalition 

of 20 higher-education 
organizations has written to 
the Turkish president, Recep 
Tayyip Erdogan, urging him to 
reaffirm Turkey’s commitment 
to academic freedom and 
freedom of expression. 

The 21 January letter, from 
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organizations including 

the European University 
Association and the Scholars 
at Risk Network, voices 
concern about academics 
who signed a petition 

calling on the government 

to end violent conflict in the 
country’s predominantly 
Kurdish southeast. Erdogan 
had accused the petition’s 
signatories of spreading 
terrorist propaganda, and the 
letter says that 1,128 academics 
have since been placed under 
investigation. 


Zika virus fears 
Concern over the mosquito- 
borne Zika virus outbreak in 
the Americas is growing. In 
a statement on 24 January, 
the Pan American Health 


unnamed — species. Classification of these 
tree-hole dwelling frogs had been confounded 
by their large snout vents and the high degree 

of webbing between their toes. But a genetic 
analysis led by S. D. Biju at the University of Delhi 
places the frogs in a distinct group (S. D. Biju et 
al. PLoS ONE 11, e0145727; 2016). 


Organization warned that Zika 
is likely to reach all countries 
in the region where the Aedes 
mosquitoes that transmit the 
virus live. The organization, 

a regional office of the World 
Health Organization, called 
for improved mosquito 
control, and advised people, 
especially pregnant women, 
to protect themselves from 
bites. In a risk assessment 

on 21 January, the European 
Centre for Disease Prevention 
and Control called for more 
surveillance of travellers 
returning from affected 
countries, so as to rapidly 
identify potential cases. This 
followed a 15 January travel 
warning from the US Centers 
for Disease Control and 
Prevention about Zika, which 




some evidence has linked to 
babies born in Brazil with 
unusually small heads and 
brains. 


Data-sharing rule 


The International Committee 
of Medical Journal Editors 

has proposed that it should 

be mandatory for research- 
paper authors to share the 
data underlying their findings. 
In an editorial published in 
several prestigious medical 
journals on 20 January, the 
group said that authors should 
share anonymized patient 
data with other researchers 

no later than six months after 
publication (see, for example, 
D. B. Taichman Ann. Intern. 
Med. http://doi.org/bb4h; 
2016). The committee also 
said that a plan for how data 
will be shared should be part 
of the clinical-trial registration 
process. Signatories to the 
editorial include the editors- 
in-chief of The New England 
Journal of Medicine and 

The Journal of the American 
Medical Association. 


PEOPLE 


AI pioneer dies 
Artificial intelligence (AI) 
expert Marvin Minsky 
(pictured) died, aged 88, on 
24 January. After gaining his 
PhD at Princeton University 
in New Jersey, Minsky joined 


TREND WATCH 


At least 1,312 rhinoceros 
were illegally killed in Africa 
in 2015 — a record high for 
the continent, according to 
figures released on 21 January 
by TRAFFIC, a wildlife-trade 
monitoring network based in 


Cambridge, UK. South Africa, 
which has the largest share of 
this poaching, reported a small 
decrease — from 1,215 rhinos in 
2014 to 1,175 in 2015. But that 
was more than offset by rises in 
illegal killing in neighbouring 
countries, according to 
TRAFFIC. 


the Massachusetts Institute 

of Technology (MIT) in 
Cambridge, where he 
co-founded the AI Laboratory 
and the MIT Media Lab. He 
built the world’s first neural- 
network simulator, invented 
the confocal microscope and 
received almost all of the 
major awards in his field. 


Berkeley lab head 


Particle physicist Michael 
Witherell will become 
director of the Lawrence 
Berkeley National Laboratory 
on 1 March. The University 
of California, which manages 
the lab on behalf of the US 
Department of Energy, 
announced the move on 

21 January. Located in 
Berkeley, California, the 

lab has an annual budget 

of nearly US$800 million 

and is home to research in 
fields ranging from materials 
science to genomics. 


Witherell is currently vice 
chancellor for research at 
the University of California, 
Santa Barbara, and led the 
Fermi National Accelerator 
Laboratory in Batavia, 
Illinois, from 1999 to 2005. 


Koch quits 

Billionaire David Koch has 
left his long-standing position 
on the board of trustees of 

the American Museum of 
Natural History in New York 
City, The New York Times 
reported on 20 January. 

Koch has given millions of 
dollars to the museum, which 
named its dinosaur wing after 
him. But his presence on the 
board has been criticized 

by scientists and activists 
because the Koch family has 
also funded climate-change 
deniers. Koch’s last day on the 
board was 9 December. Koch 
remains on the advisory board 
of the Smithsonian National 
Museum of Natural History in 
Washington DC. 


Freedom plea 


More than 360 researchers 
from around the world have 
asked Iran to free retired 
polymer scientist Mohammad 
Hossein Rafiee-Fanood, 

in a letter to the country’s 
president, Hassan Rouhani. 
Rafiee, who is 71 years old, 
was imprisoned for “peaceful 
political activism”, according to 
Amnesty International, which 
regards him as a prisoner 


AFRICAN RHINO POACHING ON THE RISE 


2015 is the worst year in decades for rhino poaching — although 
South Africa reported a small decrease. 


COMING UP 


1 FEBRUARY 

The Genome Integrity 
Discussion Group 
meets at the New York 
Academy of Sciences. 
go.nature.com/crhom3 


1-4 FEBRUARY 

The Delhi Sustainable 
Development Summit 
is hosted by TERI, the 
energy and resources 
institute. 
go.nature.com/ndvmaj 


1-5 FEBRUARY 

The Mares Conference 
on Marine Ecosystems 
Health and 
Conservation convenes 
in Olhao, Portugal. 
www.maresconference.eu 


of conscience. The letter, 
published on 26 January (see 
go.nature.com/7wflr7), says 
that he is kept in inhumane 
conditions and that the 
scientist's imprisonment 
breaches both the Constitution 
of the Islamic Republic of Iran 
and the International Covenant 
on Civil and Political Rights. 


BUSINESS 
Trade theft 


Federal prosecutors in 
Pennsylvania have charged 
five people, including 

two researchers, with the 

theft of trade secrets from 
pharmaceutical giant 
GlaxoSmithKline (GSK). The 
researchers were employed at 
the company’s facility in Upper 
Merion, Pennsylvania, and 
allegedly supplied confidential 
information about GSK’s 
experimental therapies to 

a biotechnology company 
headquartered in China. 

The trade secrets in question 
include information about 

the production of therapeutic 
antibodies, including one in 
development to treat cancer. 


ARTIFICIAL INTELLIGENCE 


Google masters Go 


Deep-learning software excels at complex ancient board game. 


BY ELIZABETH GIBNEY 


A computer has beaten a human 
professional for the first time at Go — 

an ancient board game that has long 
been viewed as one of the greatest challenges 
for artificial intelligence (AI). 

The best human players of chess, draughts 
and backgammon have all been outplayed by 
computers. But a hefty handicap was needed 
for computers to win at Go. Now Google’s 
London-based AI company, DeepMind, claims 
that its machine has mastered the game. 

DeepMind’s program AlphaGo beat Fan 
Hui, the European Go champion, five times 
out of five in tournament conditions, the firm 


reveals in research published in Nature on 
27 January. It also defeated its silicon-based 
rivals, winning 99.8% of games against the 
current best programs. The program has yet 
to play the Go equivalent of a world cham- 
pion, but a match against South Korean pro- 
fessional Lee Sedol, considered by many to be 
the world’s strongest player, is scheduled for 
March. “We’re pretty confident,” says DeepMind 
co-founder Demis Hassabis. 

“This is a really big result, it’s huge,” says Rémi 
Coulom, a programmer in Lille, France, who 
designed a commercial Go program called 
Crazy Stone. He had thought computer mastery 
of the game was a decade away. 

The IBM chess computer Deep Blue, which 


famously beat grandmaster Garry Kasparov in 
1997, was explicitly programmed to win at the 
game. But AlphaGo was not preprogrammed 
to play Go: rather, it learned using a general- 
purpose algorithm that allowed it to interpret 
the game’s patterns, in a similar way to how a 
DeepMind program learned to play 49 different 
arcade games. 

This means that similar techniques could be 
applied to other AI domains that require recog- 
nition of complex patterns, long-term planning 
and decision-making, says Hassabis. “A lot of 
the things we're trying to do in the world come 
under that rubric.” Examples are using medical 
images to make diagnoses or treatment plans, 
and improving climate-change models. 
In China, Japan and South Korea, Go is 
hugely popular and is even played by celebrity 
professionals. But the game has long interested 
AI researchers because of its complexity. The 
rules are relatively simple: the goal is to gain 
the most territory by placing and capturing 
black and white stones on a 19 x 19 grid. But 
the average 150-move game contains more 
possible board configurations — 10^170 — than 
there are atoms in the Universe, so it can’t be 
solved by algorithms that search exhaustively 
for the best move. 
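
A rough feel for that scale comes from a simple upper bound: each of the 361 points on the board is empty, black or white, which gives 3^361 arrangements. The sketch below computes that bound, which over-counts because it includes illegal positions, so it sits slightly above the figure quoted in the text. The comparison value of about 10^80 atoms is a commonly quoted estimate and is an assumption here, as are all the names.

```python
BOARD_SIZE = 19

# Assumption: rough, commonly quoted order of magnitude for atoms in the
# observable Universe; it is not a figure taken from the article.
ATOMS_IN_UNIVERSE_EXPONENT = 80


def position_upper_bound(size: int) -> int:
    """Upper bound on board configurations: every point is empty, black or white."""
    return 3 ** (size * size)


bound = position_upper_bound(BOARD_SIZE)
# Exponent of the largest power of ten that does not exceed the bound.
order_of_magnitude = len(str(bound)) - 1

print(f"3^{BOARD_SIZE ** 2} is about 10^{order_of_magnitude}")  # 10^172
print(f"Excess over atom estimate: 10^{order_of_magnitude - ATOMS_IN_UNIVERSE_EXPONENT}")
```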


ABSTRACT STRATEGY 
Chess is less complex than Go, but it still has too 
many possible configurations to solve by brute 
force alone. Instead, programs cut down their 
searches by looking a few turns ahead and judg- 
ing which player would have the upper hand. In 
Go, recognizing winning and losing positions 
is much harder: stones have equal values and 
can have subtle impacts far across the board. 
To interpret Go boards and to learn the best 
possible moves, the AlphaGo program applied 
deep learning in neural networks — brain- 
inspired programs in which connections 
between layers of simulated neurons are 
strengthened through examples and experi- 
ence. It first studied 30 million positions from 
expert games, gleaning abstract information 
on the state of play from board data, much as 


other programmes categorize images from 
pixels (see Nature 505, 146-148; 2014). Then 
it played against itself across 50 computers, 
improving with each iteration, a technique 
known as reinforcement learning. 

The software was already competitive with 
the leading commercial Go programs, which 
select the best move by scanning a sample of 
simulated future games. DeepMind then com- 
bined this search approach with the ability to 
pick moves and interpret Go boards — giving 


AlphaGo a better idea of which strategies are 
likely to be successful. The technique is 
“phenomenal”, says 
Jonathan Schaeffer, a computer scientist at the 
University of Alberta in Edmonton, Canada, 
whose software Chinook solved draughts in 
2007. Rather than follow the trend of the past 
30 years of trying to crack games using comput- 
ing power, DeepMind has reverted to mimick- 
ing human-like knowledge, albeit by training, 
rather than by being programmed, he says. The 
feat also shows the power of deep learning, which 
is going from success to success, says Coulom. 
“Deep learning is killing every problem in AI.” 
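
The idea of choosing a move by scanning a sample of simulated future games can be illustrated on a much smaller game. The sketch below is not DeepMind's method and uses no neural networks: it is a minimal flat Monte Carlo player for a hypothetical stone-taking game (remove one, two or three stones; whoever takes the last stone wins), and every name in it is an assumption made for illustration.

```python
import random


def legal_moves(stones: int) -> list[int]:
    """A player may remove 1, 2 or 3 stones, but never more than remain."""
    return [n for n in (1, 2, 3) if n <= stones]


def random_playout(stones: int, my_turn: bool, rng: random.Random) -> bool:
    """Play random moves to the end; return True if 'we' take the last stone."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn


def choose_move(stones: int, playouts: int = 500, seed: int = 0) -> int:
    """Pick the move whose sampled future games are won most often."""
    rng = random.Random(seed)
    best_move, best_rate = 0, -1.0
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            wins = playouts  # taking the last stone wins outright
        else:
            # After our move it is the opponent's turn.
            wins = sum(random_playout(remaining, False, rng) for _ in range(playouts))
        rate = wins / playouts
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move


print(choose_move(3))  # 3: taking all three stones wins immediately
```

Random playouts are a crude guide to strength, which is why guiding the search with learned move selection and board evaluation, as the article describes, is reported to help so much.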
AlphaGo plays in a human way, says Fan. 
“If no one told me, maybe I would think the 
player was a little strange, but a very strong 
player, a real person.” The program seems to 




have developed a conservative (rather than 
aggressive) style, adds Toby Manning, a lifelong 
Go player who refereed the match. 

Google’s rival firm Facebook has also been 
working on software that uses machine learn- 
ing to play Go. Its program, called darkforest, 
is still behind commercial state-of-the-art Go 
AI systems, according to a November preprint. 

Hassabis says that many challenges remain 
in DeepMind’s goal of developing a generalized 
AI system. In particular, its programs cannot 
yet usefully transfer their learning about one 
system — such as Go — to new tasks; a feat that 
humans perform seamlessly. “We've no idea 
how to do that. Not yet,” Hassabis says. 

Go players will be keen to use the software 
to improve their game, says Manning, although 
Hassabis says that DeepMind has yet to decide 
whether it will make a commercial version. 

AlphaGo hasn't killed the joy of the game, 
Manning adds. Strap lines boasting that Go is 
a game that computers can't win will have to 
be changed, he says. “But just because some 
software has got to a strength that I can only 
dream of, it’s not going to stop me playing.” 
SEE EDITORIAL P.437 


Dog DNA probed for clues 
to human psychiatric ills 


Project will compare gene data to owners’ assessments of how their companions behave. 


BY HEIDI LEDFORD 


Addie plays hard for an 11-year-old 
Swiss mountain dog — she 

will occasionally ignore her advanced 
years to hurl her 37-kilogram body at an 
unwitting house guest in greeting. But she 
carries a mysterious burden: when she was 
18 months old, she started licking her front 
legs aggressively enough to wear off patches 
of fur and draw blood. 

Addie has canine compulsive disorder — a 
condition that is thought to be similar to 
human obsessive-compulsive disorder 
(OCD). Canine compulsive disorder can cause 
dogs to chase their tails for hours on end, or 
to suck on a toy or body part so compulsively 
that it interferes with their eating or sleeping. 

Addie may soon help researchers to 


determine why some dogs are more prone to the 
disorder than others. Her owner, Marjie Alonso 
of Somerville, Massachusetts, has enrolled her 
in a project called Darwin's Dogs, which aims 
to compare information about the behaviour 
of thousands of dogs against the animals’ DNA 
profiles. The hope is that genetic links will 
emerge to conditions such as canine compul- 
sive disorder and canine cognitive dysfunction 
— a dog analogue of dementia and possibly 
Alzheimer’s disease. The project organizers have 
enrolled 3,000 dogs so far, but hope to gather 
data from at least 5,000, and they expect to begin 
analysing DNA samples in March. 

“It’s very exciting, and in many ways it’s 
way overdue,” says Clive Wynne, who studies 
canine behaviour at Arizona State University 
in Tempe. 

Researchers have long struggled to find 
genetic links to human psychiatric disorders 
by analysing DNA samples from thousands 
of people. Those efforts have in recent years 
met with some success in schizophrenia and 
depression. But for some conditions, includ- 
ing OCD, not a single robust genetic link has 
been sifted from the background noise of 
normal genetic variation. 

Human studies are difficult in part because 
the species is so genetically diverse, says 
Wynne. Dogs, however, are more genetically 
homogeneous. Selected over thousands of 
years for particular characteristics, they dis- 
play less genetic variation than do humans. 
Pure-bred dogs, in particular, have been ren- 
dered highly genetically consistent to achieve a 
homogenous appearance and behaviour. 

Dogs also live side-by-side with humans, 
which some think can make them a better 




model for human disorders than mice living 
in a laboratory cage. 

These qualities have made dogs attractive 
targets for studies of analogues to human 
ailments, including epilepsy, cancer and vari- 
ous psychiatric disorders. Border collies, for 
example, may over-react to loud noises in a 
manner akin to people with anxiety disorders. 
Geneticist Elinor Karlsson of the University of 
Massachusetts Medical School in Amherst and 
her colleagues have studied canine compulsive 
disorder, a condition that is particularly com- 
mon in certain breeds, including Dobermann 
pinschers. Their studies in 150 dogs have 
found possible links to four genes that encode 
proteins that act in the brain (R. Tang et al. 
Genome Biol. 15, R25; 2014). 

To expand on those results, Karlsson has 
decided to go big. Limiting her studies to 
specific breeds would make it easier to pick 
out some genetic links, but others might 
be missed. So Karlsson and her colleagues, 
including Jesse McClure, a former dog trainer 
for the US Marine Corps, decided to collect 
data from mongrels as well as pure-bred dogs 
and to crowdsource the data collection. 

That focus on mixed-breed dogs is unusual 
but shrewd, says Adam Boyko, a geneticist 
at Cornell University in Ithaca, New York. 
Although more than half of the dogs in the 
United States are mongrels, genetic studies 
tend to focus on pure-bred animals. “Genet- 
ics often deals with the interactions between 
genes,” says Boyko. “And if you want to truly 
understand those, you want to study individu- 
als where you've shuffled up the genes.” 

Human participants in Darwin’s Dogs, 
which launched last October, answer about 
130 questions about their pets’ behaviour. The 
questions cover everything from “Does your 
dog generally enjoy life?” (the answer, says 
Karlsson, is overwhelmingly ‘yes’) to “Does 
your dog cross its paws when it lies down?” 
Some questions were inspired by surveys that 
assess impulsivity in humans. Other ques- 
tions have been suggested by Alonso, who 
is the executive director of the International 
Association of Animal Behavior Consultants 
in Cranberry Township, Pennsylvania, and by 
other dog trainers on the basis of observations 
made over decades of working with animals 
that have behavioural problems. 

Karlsson says that she is thinking of 
expanding the list of questions even further. 
Tail-chasing in dogs is suspected to share genetic roots with human obsessive-compulsive disorder. 


“Fortunately, it turns out that people love to 
talk about their dogs,” she says. 

Ultimately, the success of the project may 
hinge on the quality of those surveys and the 
specificity of the questions asked, says Wynne. 
Asking owners whether their dog is happy, 
for example, could yield mixed results. “One 
person's unhappy dog is another person’s com- 
fortably resting dog,” he says. “A good ques- 
tion would be: ‘Does your dog poop on the 
carpet?’ Because poop on the carpet is pretty 
damn clear.” 

It is still unclear how useful the results 
from dogs will be in shedding light on human 
behavioural variation. Karlsson is hopeful that 
even if different genes are involved in the two 
species, they may converge on the same cel- 
lular pathways. Gerald Nestadt, a psychiatrist 
who specializes in OCD at Johns Hopkins 
University in Baltimore, Maryland, notes that 
affected animals often display only one type of 
compulsive behaviour, whereas a human with 
OCD will typically have several. 

Even so, he adds, the field is hungry for 
any leads it can get. “Anything that will help 
is worth trying,” he says. “I think this project 
is a great idea.” 

For their part, Alonso and other partici- 
pants are eager to learn more about their own 
dogs and why they behave the way they do. 
Miranda Workman of Buffalo, New York, 
enrolled her three dogs — Zeus, Athena and 
Sherlock — into the study, in part to gain 
insight into their behavioural quirks. Although 
Athena, a 34-kilogram Dutch shepherd, was 
bred to be a dedicated herding and guarding 
dog, she has a jovial side that is not often found 
in her breed. And Sherlock, a Jack Russell, is 
more shy and sensitive than other terriers. 

“I have some dogs that don’t necessarily 
fit the stereotype,” says Workman. “Is it 
their environment that’s different or are they 
different? It will be fun to find out why they 
are that way.” 
A black hole, visualized here in the M60-UCD1 galaxy, was thought to lose information as it disappears. 


Physicists split by 
Hawking paper 


Some welcome his latest report as a fresh way to solve a 
black-hole conundrum; others are unsure of its merits. 


BY DAVIDE CASTELVECCHI, CAMBRIDGE, UK 


Almost a month after Stephen 
Hawking and his colleagues posted a 
paper about black holes online, physicists 
still cannot agree on what it means. 

Some support the preprint’s claim — that it 
provides a promising way to tackle a conun- 
drum known as the black hole information 
paradox, which Hawking identified more than 
40 years ago. “I think there is a general sense of 
excitement that we have a new way of looking 
at things that may get us out of the logjam,” says 
Andrew Strominger, a physicist at Harvard 
University in Cambridge, Massachusetts, and 
a co-author of the latest paper. 

Strominger presented the results on 
18 January at a crowded talk at the University 
of Cambridge, UK, where Hawking is based. 

Others are not so sure that the approach can 
solve the paradox, although some say that the 
work illuminates various problems in phys- 
ics. In the mid-1970s, Hawking discovered 
that black holes are not truly black, and in fact 
emit some radiation. According to quantum 
physics, pairs of particles must appear out of 


quantum fluctuations just outside the event 
horizon — the black hole’s point of no return. 
Some of these particles escape the pull of the 
black hole but take a portion of its mass with 
them, causing the black hole to slowly shrink 
and eventually disappear. 

In a paper published in 1976, Hawking 
pointed out that the outflowing particles — 
now known as Hawking radiation — would 
have completely random properties. As a 
result, once the black hole was gone, the infor- 
mation carried by anything that had previ- 
ously fallen into the hole would be lost to the 
Universe. But this result clashes with laws of 
physics that say that information, like energy, is 
conserved, creating the paradox. “That paper 
was responsible for more sleepless nights 
among theoretical physicists than any paper 
in history,” Strominger said during his talk. 

The mistake, Strominger explained, was 
to ignore the potential for the empty space 
to carry information. In their paper, he and 
Hawking, along with their third co-author Mal- 
colm Perry, also at the University of Cambridge, 
turn to soft particles. These are low-energy ver- 
sions of photons, hypothetical particles known 
as gravitons and other particles. Until recently, 
these were mainly used to make calculations 
in particle physics. But the authors note that 
the vacuum in which a black hole sits need not 
be devoid of particles — only energy — and 
therefore that soft particles are present there in 
a zero-energy state. 

It follows, they write, that anything falling 
into a black hole would leave an imprint on 
these particles. “If you’re in one vacuum and 
you breathe on it — or do anything to it — you 
stir up a lot of soft gravitons,” said Strominger. 
After this disturbance, the vacuum around the 
black hole has changed, and the information 
has been preserved after all. 

The paper goes on to suggest a mechanism 
for transferring that information to the black 
hole — which would have to happen for the 
paradox to be solved. The authors do this by 
calculating how to encode the data in a quan- 
tum description of the event horizon, known 
whimsically as ‘black hole hair’. 


TRICKY TRANSFER 
Still, the work is incomplete. Abhay Ashtekar, 
who studies gravitation at Pennsylvania State 
University in University Park, says that he finds 
the way that the authors transfer the infor- 
mation to the black hole — which they call 
‘soft hair’ — unconvincing. And the authors 
acknowledge that they do not yet know how the 
information would subsequently transfer to the 
Hawking radiation, a further necessary step. 
Steven Avery, a theoretical physicist at Brown 
University in Providence, Rhode Island, is scep- 
tical that the approach will solve the paradox, 
but is excited by the way it broadens the signifi- 
cance of soft particles. He notes that Strominger 
has found that soft particles reveal subtle symmetries 
of the known forces of nature, “some 
of which we knew and some of which are new”. 
Other physicists are more optimistic about 
the method’s prospects for solving the informa- 
tion paradox, including Sabine Hossenfelder of 
the Frankfurt Institute for Advanced Studies 
in Germany. She says that the results on soft 
hair, together with some of her own work, 
seem to settle a more-recent controversy over 
black holes, known as the firewall problem (see 
Nature 496, 20-23; 2013). This is the ques- 
tion of whether the formation of Hawking 
radiation makes the event horizon a very hot 
place. That would contradict Albert Einstein’s 
general theory of relativity, in which an 
observer falling through the horizon would 
see no sudden changes in the environment. 
“If the vacuum has different states,” 
Hossenfelder says, “then you can transfer 
information into the radiation without hav- 
ing to put any kind of energy at the horizon. 
Consequently, there’s no firewall.” 
Monkeys genetically modified 
to show autism symptoms 


But it is unclear how well the results match the condition in humans. 


BY DAVID CYRANOSKI 


The laboratory monkeys run obsessively 
in circles, largely ignore their peers and 
grunt anxiously when stared at. Engi- 

neered to have a gene that is related to autism 

spectrum disorder in people, the monkeys are 
the most realistic animal model of the condi- 
tion yet, say their creators. Researchers hope 
that the animals will open up new ways to test 
treatments and investigate the biology of autism. 

But the jury is still out on how well the monkeys’ 

condition matches human autism. 

Autism has a vast array of symptoms and 
types, and researchers think that at least 
100 genes play a part. The scientists who led the 
latest work, which is published on 25 January 
in Nature (Z. Liu et al. Nature http://doi.org/ 
bb3k; 2016), turned to the autism-related gene 
MECP2: many of the symptoms of autism are 
found in people who have extra copies of the 
gene (MECP2-duplication syndrome) as well 
as in people who have certain mutations in 
this gene (Rett’s syndrome). Researchers have 
engineered monkeys to have autism-related 
genes before (H. Liu et al. Cell Stem Cell 14, 
323-328; 2014), but this is the first published 
demonstration of a link between those genes 
and the animals’ behaviour. 

Back in 2010, the team that did the latest 
work, led by researchers at the Chinese Acad- 
emy of Sciences’ Institute of Neuroscience in 
Shanghai, attached human MECP2 genes to 
a harmless virus, which they injected into the 
eggs of crab-eating macaque monkeys (Macaca 

fascicularis). The eggs were then fertilized, 

and the developing embryos were implanted 
into female monkeys. The result was 8 geneti- 
cally manipulated newborns, which each had 

1-7 extra copies of MECP2. Examinations of 

other, stillborn monkeys revealed that the extra 

copies were being expressed in the brain. “That 
was the first exciting moment,” says Zilong Qiu, 

a molecular biologist at the Institute of Neuro- 

science and a co-author of the paper. 

The next breakthrough came about a year 
later, when the monkeys showed behaviours 
that hinted at autism: running around in tight 
circles in a strange manner. “If another monkey 
is in its way, it will either jump over the mon- 
key, or go around it, but then it would return 
to its original circular path,” says co-author Sun 
Qiang, a reproductive biologist at the institute. 

The team launched a battery of behavioural 


A macaque made to have autism-like behaviours. 


tests, which showed that all of the monkeys 
had at least one autism-like symptom, such 
as repetitive or asocial behaviour, and that 
the symptoms were more severe in males, as 
seen in people with the MECP2 duplications. 
But this still wasn’t enough to be sure that the 
monkeys were a sound model of autism — and 
a paper that the team submitted for publica- 
tion in 2013 was rejected. Among other things, 
reviewers wanted to know whether the unusual 
behaviour was just the result of fiddling around 
with the genome. “We needed to show where 
the gene makes a difference,” says Qiu. 

That opportunity came with the next 
generation of macaques, which the team created 
with unprecedented speed. When the monkeys 
were 27 months old and not yet sexually mature, 
Sun's team took testes from the males, matured 
the tissue artificially by grafting it under the 
skin on the backs of castrated mice, and used 
the resulting sperm to fertilize eggs from non- 
engineered macaques. The offspring showed 
asocial behaviour at about 11 months. That 
both gene and symptoms seemed to be passed 
on to a second generation was finally enough to 
convince reviewers, says Qiu. 

The macaque model is “superior” to mouse 
models of autism because “it actually shows 
more clearly some of the autism-like behaviours”, 
says Alysson Muotri, who researches 
stem cells, autism and Rett’s syndrome at the 


University of California, San Diego. But he adds 
that the symptoms in both mice and monkeys 
still seem less severe than “what we actually 
observe in human patients”. “It remains to be 
seen if the model can actually generate novel 
insights into the human condition,” he says. 

Huda Zoghbi, a pioneer of MECP2 studies 
in mice at Baylor College of Medicine in 
Houston, Texas, is even more cautious. The 
monkeys do not mimic some of the human 
MECP2-duplication symptoms, such as sei- 
zures and severe cognitive problems, she 
notes. This could be because the expression of 
the gene in the monkey model is triggered by a 
different mechanism from that in humans — a 
limitation that the authors recognize — and 
she advises caution in using the model to make 
assumptions about human autism. 

Qiu, meanwhile, is excited by the prospect of 
using the model to identify exactly where in the 
brain the MECP2 overexpression causes trouble. 
His team is already using brain-imaging tech- 
nology on the monkeys to pinpoint such areas. 
Next, the researchers plan to use the CRISPR 
gene-editing technique to knock out the extra 
MECP2 copies in cells in those regions and then 
check whether the autism-like symptoms stop. 

It is unlikely that such a technique would 
be approved for use in people any time soon. 
But the regions identified in the monkey study 
could be targeted with other, existing treatments 
— such as deep brain stimulation, which has 

had success in treating Parkinson’s disease and depression. 
Because the structure of the mouse brain is 
so different from that 
of the human brain, Qiu says that the monkey 
imaging will allow more parallels to be drawn 
with humans than mice studies could. Working 
with a mental-health hospital, the team is also 
trying to identify the autism-linked genes that 
are most common in the Chinese population. 

If non-human primates prove to be a useful 
model for psychiatric disorders, China and 
other countries that are investing heavily in 
research on monkeys, such as Japan, could 
gain an edge in brain research. Muotri says that 
such studies probably wouldn't be done in the 
United States, where research on monkeys is 
more expensive and controversial. “China and 
Japan have a clear advantage over the US on 
this area,” he says. 


2015 breaks 
heat record 


Pacific Ocean warming 
helped to make last year the 
hottest in history. 


BY JEFF TOLLEFSON 


It’s official: 2015 was the hottest year on
record. Global data show that a powerful
El Niño system, marked by warmed waters
in the tropical Pacific Ocean, helped to drive
atmospheric temperatures well past 2014’s
record highs. Some researchers suggest that
broader Pacific trends could spell even more
dramatic temperature rises in years to come.

Released on 20 January, the temperature
data come from independent records maintained
by NASA, the US National Oceanic
and Atmospheric Administration and the UK
Met Office. All three, along with an analysis
released by the World Meteorological Organization
on 25 January, document unprecedented
high temperatures in 2015, pushing
the global average to at least 1 °C above pre-industrial
levels and 0.16 °C above 2014.

Although El Niño contributed to the
warmth late in the year, US government scientists
say that the steady increase in atmospheric
concentrations of greenhouse gases
continues to drive overall warming.

The current El Niño is predicted to continue
to boost average global temperatures over the
next several months. This could translate into
another year of record heat. But the question
facing scientists is whether this near-record
El Niño has helped to flip the Pacific Ocean
into a warmer state.

The Pacific Decadal Oscillation (PDO) is
a 15- to 30-year cycle that increases sea surface
temperatures across the eastern Pacific
in its positive phase and produces cooler temperatures
in its negative phase. Since 1998,
after the last major El Niño and a subsequent
La Niña cooling, the PDO has been mostly
negative. Some scientists say that the cooling
helped to suppress the increase in global
temperatures in the early part of the millennium.
But since early 2014, the PDO has been
mostly positive.

Jerry Meehl, a climate modeller at the
National Center for Atmospheric Research in
Boulder, Colorado, has a study under review
that suggests that the PDO is likely to remain
in a positive state over the coming decade.
“Over the next ten years,” he says, “we see
higher rates of warming.”
Fires in Indonesia last year contributed greatly to the nation’s greenhouse-gas emissions.


Paris deal strains 
carbon accountancy 


Developing countries need help to meet reporting rules. 


BY JEFF TOLLEFSON 


province, a lightly populated swathe 

of Borneo, is a hotbed for greenhouse 
gases that are emitted through deforesta- 
tion, family farming and industrial palm- 
oil production. Last year, these activities 
fuelled devastating fires that torched more 
than 400,000 hectares in the province and 
at least 2.5 million hectares across the 
developing nation. By some estimates, the 
fires released into the atmosphere more 
than twice as much carbon as Germany 
emits in a year. 

Calculating the volume of greenhouse 
gases emitted across such a dynamic land- 
scape is not an easy task. Nevertheless, the 
climate agreement made in Paris last month 
dictates that nearly every country will need 
to begin assembling detailed inventories of 
their greenhouse-gas emissions in a few 
years’ time. In Indonesia, the national 
government has delegated much of that 
responsibility to provincial governments. 
Soon, administrations that often strug- 
gle to provide basic public services will be 
required to master the complex science of 
carbon accountancy. 

“We cannot rely on the local govern- 
ments,” says Rizaldi Boer, who heads the 
Centre for Climate Risk and Opportu- 
nity Management in Southeast Asia and 
the Pacific in West Java. “We need to 
integrate this kind of training into the local
universities.” 

It is a challenge that is being faced by 
developing countries across the globe. The 
Paris agreement relies on a ‘pledge and 
review’ programme to reduce emissions
and halt global warming over the course 
of the twenty-first century. Under that 
strategy, countries must document their 
progress towards voluntary commitments 
to limit carbon emissions. Solid and trans- 
parent data will be needed to verify that 
they are living up to their promises. 

“This is really the compliance regime for 
the Paris agreement,” says Alden Meyer,
director of strategy and policy for the 
Union of Concerned Scientists in Wash- 
ington DC. “This is how you really tell how 
well you are doing.”

Developed countries have been sub- 
mitting detailed reports on greenhouse- 
gas emissions to the United Nations for 
years. But until now, developing countries 
— which produce nearly two-thirds of 
global greenhouse-gas emissions — were 
not required to provide such comprehen- 
sive reports on a regular basis. Under the 
Paris agreement, most countries will need 
to supply inventories of greenhouse-gas 
emissions every two years — and although 
the deal includes some flexibility, the details 
have yet to be resolved. 

Efforts are under way to build a network 
of professional carbon accountants across 
the developing world. Partnering with the
Greenhouse Gas Management Institute 
(GHGMI), a non-profit organization head- 
quartered in Washington DC that has trained 
more than 3,000 people in carbon account- 
ing, Boer is developing a formal curriculum 
for universities in Indonesia. His focus lies 
on teaching the assessment of land-use emis- 
sions, which are the largest — and hardest to 
quantify — source of greenhouse gases in the 
country. The ultimate aim is to build a quali- 
fied workforce to feed into local governments 
as well as businesses and other institutions 
that are working to tackle climate change. 

“We can't implement this agreement 
without building capacity,” says John Niles,
who directs the Carbon Institute, an arm 
of the GHGMI that develops training 
programmes in the measurement and moni- 
toring of carbon emissions. “We need the 
right investments in the right institutions, in 
every country on Earth.”


LOCAL RESPONSIBILITY 

Although the central government of Indo- 
nesia provides basic data on deforestation, 
provincial officials must calculate emissions 
from the expanding agriculture sector as well 
as from the energy, industrial and municipal 
sectors. After the data have been collected 
and the calculations performed, Boer says,
officials must complete and submit 186 forms
to meet the reporting requirements laid out
by the Intergovernmental Panel on Climate
Change.

“It’s very hard for the local governments
to understand,” he says. But if Indonesia
succeeds in building the technical capacity
to conduct such assessments, it will
become easier for the country to test
new policies and to assess what will work
best to reduce emissions in the future. “We
have to see this as an opportunity,” says Boer.
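
The arithmetic at the core of such an inventory can be sketched briefly. A common starting point in the IPCC guidelines is to multiply activity data (how much of something happened) by an emission factor (emissions per unit of that activity) and to sum the results across sectors. The sketch below is only an illustration of that idea: the class and function names are made up for this example, and every quantity and factor is an invented placeholder, not Indonesian data.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """One emitting activity in a hypothetical provincial inventory."""
    sector: str
    activity: float         # amount of activity, in the unit named in `unit`
    unit: str
    emission_factor: float  # tonnes of CO2-equivalent per unit of activity


def emissions(source):
    """Emissions for one source: activity data times emission factor."""
    return source.activity * source.emission_factor


def inventory_total(sources):
    """Total emissions (tonnes CO2e) across all sources."""
    return sum(emissions(s) for s in sources)


def by_sector(sources):
    """Emissions (tonnes CO2e) grouped by sector name."""
    totals = {}
    for s in sources:
        totals[s.sector] = totals.get(s.sector, 0.0) + emissions(s)
    return totals


# Placeholder values chosen only to make the arithmetic easy to follow.
example = [
    Source("land use", 1_000.0, "hectares cleared", 500.0),
    Source("energy", 2_000.0, "MWh generated", 0.5),
    Source("agriculture", 10_000.0, "head of cattle", 1.5),
]

if __name__ == "__main__":
    print(by_sector(example))
    print(inventory_total(example))
```

In practice the hard part is not the multiplication but obtaining defensible activity data and emission factors, which is where the training described here comes in.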

Cost will be a barrier. Although poorer 
countries such as Indonesia finally agreed to 
a unified framework for reporting emissions 
in Paris, they fought hard for assurances that 
wealthier countries would provide money to 
help them to kick-start the process. 

Various initiatives are already under way. 
The GHGMI has developed an online train- 
ing course in carbon accounting and is inves- 
tigating ways to provide ongoing support 
for individuals and institutions in develop- 
ing countries. And in March, a coalition of 
countries and climate-advocacy organiza- 
tions led by Germany is expected to launch a 
US$15-million initiative to help less-wealthy
countries prepare to monitor and report their
greenhouse-gas emissions.

The process could function differently in 
each country. But Yamide Dagnet, a former 
climate negotiator for the UK government 
who works for the World Resources Institute 
in Washington DC, suggests that academic 
institutions can serve as repositories for 
expertise on behalf of governments, which 
often lose their most-skilled and experi- 
enced carbon accountants to the United 
Nations, think tanks and corporations. 
“You could have universities and institutes 
designed to provide data, with some perma- 
nent experts and graduate students as well,” 
says Dagnet. “I think we need to be really 
creative.” 

Perhaps the biggest challenge is how to 
move quickly. Developing countries may 
need to begin reporting within a few years, 
says Michael Gillenwater, executive direc- 
tor and dean of the GHGMI. The prevailing 
approach of hosting workshops and one-time 
training sessions to promote minimal local 
expertise while bringing in consultants to 
oversee one-off greenhouse-gas inventories 
will not be enough. 

“The current model will break if you 
try and scale it up,” says Gillenwater. “We 
need more and better people and we need a 
different model.”
Songbirds are a culinary delicacy in Cyprus — but catching and eating them is illegal. Even so, the practice is on the rise and could be threatening rare species.

BY SHAONI BHATTACHARYA

It wasn’t until I saw the blade glinting in the sunlight that I realized
how grave the situation was. Broad and belligerent in army fatigues,
the man strode along the track, ranting in Greek. Behind his back, his
hands flexed a knife blade in and out of its wooden handle. This man was
a trapper, a poacher of birds — and he clearly didn’t want company. “What
are you doing here?” he demanded.

My companions and I had come to this dry scrubland on the
Mediterranean island of Cyprus to look for evidence of songbird trapping.
The birds are caught illegally and eaten in a traditional dish called
ambelopoulia — and I was joining a September trip to monitor the extent
of trapping. With me was Roger Little, a British conservation volunteer,
and Savvas, a field officer with the conservation group BirdLife Cyprus
whose name has been changed to protect his identity. We didn’t expect to
encounter trappers at this spot in the southeastern region of Cape Pyla;
they usually work at night, when the birds are active. But now it seemed
that they had started patrolling the site during the day. “You are on my
land,” the trapper said to us in Greek.
“If this is your property, then I apologize — we didn’t know, we are
going,” Savvas said. We acted casual as the man escorted us back to the
battered four-by-four in which we had come. “I shouldn’t really be letting
you go,” he muttered. Moments later, we were driving away. 

Bird trapping in Cyprus has grown into a controversy that encompasses 
crime, culture, politics and science. The practice was made illegal more 
than 40 years ago — but that simply forced it underground. Today, trappers
routinely cut wide corridors through vegetation and string fine ‘mist
nets’ from poles to catch the birds, which are sent to local restaurants and
quietly served. A platter of a dozen birds sells for €40–80 (US$44–87), and
the trade in songbirds is responsible for an estimated annual market of 
€15 million. The delicacy is so prized and lucrative that it is suspected to 
be linked to organized crime, and those trying to stop it have been subject 
to intimidation and violence. 

Conservation organizations say that the trapping is increasing and 
that it is threatening rare bird species that stop in Cyprus during their 
migration. Last March, a report by BirdLife Cyprus suggested that 
some 2 million birds had been killed in the previous autumn, including 
78 threatened species. The group claims that trapping — on top of threats 
from climate change, habitat loss and invasive species — could cause 
irreparable damage to some bird populations. “Illegal bird killing just 
cannot be justified, it’s like the last kick off the cliff for some species,” says
Clairie Papazoglou, executive director of BirdLife Cyprus near Nicosia. 

But the picture is not black and white, in part because the extent of bird 
killing is disputed and its effects on bird populations are unclear. Critics 
have questioned the methods used by BirdLife Cyprus to estimate the 
numbers being captured on the island. The debate led to a workshop last 
July to discuss the science, with representatives from all agencies involved. 
Attendee Alison Johnston, an ecological statistician at the British Trust 
for Ornithology (BTO), a charitable research institute in Thetford, says 
that so little is known about the population sizes and routes of migratory 
birds in the Mediterranean that it is difficult to assess the full impacts of 
trapping. “If we knew more about the numbers,” she says, “we could say
whether this is a critical number being killed.”

The debate over Cyprus’s songbirds could have wider repercussions,
because bird killing is rife in other parts of the world. A 2015 report from 
BirdLife International estimates that hunters are killing about 25 mil- 
lion birds a year over the whole Mediterranean region; Cyprus stands 
out because so many are killed in such a small country. Globally, more 
than half of the world’s migratory bird populations are thought to be in 
decline. “This isn’t just an issue for Cyprus, or Africa, or Europe,” says
Claire Runge, a conservation scientist at the University of Queensland in 
Brisbane, Australia, who led a study published last December showing 
that only 9% of migratory birds worldwide are adequately protected across 
their range. “Countries will need to work together to find a solution to
what is essentially a human-wildlife conflict,” she says.

Papazoglou worries that what is happening in Cyprus sets a dangerous 
precedent. “This level of rampant illegality in an EU country sends a ter- 
rible message to the rest of the world. If rich, stable and well-run countries 
cannot enforce wildlife law, what hope is there to get fragile countries in 
the Middle East and Africa to act?” 


GATEWAY TO CONTINENTS 

Situated in the far southeastern corner of the Mediterranean, Cyprus 
is a gateway to three continents and has been fought over for millennia 
(see ‘Trapped in Cyprus’). It is currently sliced up into four jurisdictions:
the Republic of Cyprus and the Turkish-occupied region of Northern 
Cyprus, separated by a UN buffer zone, and two small pockets called 
Sovereign Base Areas (SBAs) that were retained by the United Kingdom 
after the island gained independence in 1960 because of their strategic 
military importance. (Britain is currently using one of these areas to 
deploy air strikes to Syria.) 

The island's location also makes it an ideal rest-stop for migratory birds. 
Nearly half of the bird species from Europe, North Africa and the Middle 
East are thought to use the island as a migratory staging post as they fly 
south in the autumn, and back again in spring. These include common 
birds such as sparrows and the European robin (Erithacus rubecula), as 
well as threatened species including the barn owl (Tyto alba), common
kingfisher (Alcedo atthis) and European turtle dove (Streptopelia turtur).
All of these creatures have been found in the trappers’ nets, as have some
threatened, non-migratory, endemic bird species such as the Cyprus warbler
(Sylvia melanothorax) and the Cyprus wheatear (Oenanthe cypriaca).

[Figure: TRAPPED IN CYPRUS. Cyprus is a key stop for migratory birds and a hotspot for illegal bird trapping. Surveys estimate that trapping is on the rise. Map showing the main trapping areas, the UN buffer zone and the British Sovereign Base Areas (SBAs), with Famagusta labelled. Bird trappers clear corridors through trees and string nets across them. Migrating birds tend to congregate in certain parts of the island such as Cape Pyla. Surveys suggest trapping is a particular problem in the SBAs. Chart of autumn trapping activity: estimated trapping reached a record high — more than 2 million birds — in 2014.]

The practice of trapping dates back to a time when birds were among 
the few easily found sources of protein on this arid island. Originally, 
ambelopoulia would have been a plate of blackcaps (Sylvia atricapilla), 
but the dish has extended to include 22 species of songbird. The tradi- 
tional trapping method is to ensnare birds in trees with strategically placed 
‘limesticks’ — twigs coated in a goo of mud mixed with Syrian plum juice. 
But in 1974, laws were introduced to ban non-selective capture methods, 
including limesticks and mist nets. Bird trapping is also illegal under the 
European Union (EU) Birds Directive and the Convention on the Con- 
servation of European Wildlife and Natural Habitats (known as the Bern 
Convention), both of which Cyprus has adopted. 

The practice never stopped. Many Cypriots argue that bird trapping 
for ambelopoulia is a tradition and a right, and it has become a highly 
emotive issue. In the Famagusta district, raids on restaurants and arrests 
related to bird trapping have sparked public protests, and some politicians 
either covertly or overtly support it. Last December, Evgenios Hamboul- 
las, Famagusta member of parliament for the incumbent Democratic 
Rally party, posted a photo of himself on Facebook seated in front of 
a plate of songbirds with the caption: “Soon in our restaurants! Happy 
holidays!” The post received nearly 600 likes in 5 days, and condemna- 
tion from his party. 

Conservation groups believe that bird trapping is rising fast; last year’s 
BirdLife Cyprus report said that the practice had reached “industrial 
scale”. Trappers rip out the island’s native scrub bushes, then plant and 
irrigate lush, bright-green acacia trees that attract birds. They cut cor- 
ridors through the groves and string mist nets across them from poles. 
DNA identifies baked birds


A DNA technique used to identify the provenance of foods is being 
turned into the latest crime-fighting tool to tackle illegal bird killing in 
Cyprus. 

About 22 bird species are served up illegally by Cypriot restaurants 
in a traditional cuisine called ambelopoulia (pictured). The hope is 
that DNA barcoding — a technique that uses DNA sequences to 
identify a species — can show whether a restaurant owner is passing 
off illegally trapped birds as chicken or another meat during a raid. 
Proof that the birds being eaten are protected species might help to 
build a stronger case to prosecute lawbreakers. 

The project is a three-year collaboration between researchers at 
the University of Pisa, Italy, and the University of Cyprus, Nicosia, 
as well as the Ministry of the Interior’s Cyprus Game and Fauna 
Service and the conservation group BirdLife Cyprus. So far, the 
team’s unpublished work has shown that sequences from part of a 
single gene (for cytochrome c oxidase) are enough to distinguish 81 
bird species. This worked even when the DNA was extracted from 
meat baked at 90°C, and cooked with salt or vinegar — a method 
that matches local gastronomy but that could degrade the DNA. 

“[It was] prepared in a particular way so we would be sure our DNA
investigations would be effective,” says Filippo Barbanera, a zoologist 
at the University of Pisa. 
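
The matching step behind DNA barcoding can be sketched in a few lines: a sequence recovered from a sample is compared with reference sequences from known species, and the closest match is reported if it is similar enough. The sketch below is a toy illustration under stated assumptions, not the Pisa team's method. The sequences are short invented strings rather than real gene fragments, the species labels are placeholders, the 90% threshold is arbitrary, and real workflows use much longer sequences and dedicated alignment software.

```python
def percent_identity(a, b):
    """Share of positions (in %) at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length in this toy example")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identify(query, references, threshold=90.0):
    """Return (label, score) for the best-matching reference.

    The label is None when even the best match falls below the threshold.
    """
    best_label, best_score = None, -1.0
    for label, sequence in references.items():
        score = percent_identity(query, sequence)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Invented 20-base strings standing in for reference barcodes.
REFERENCES = {
    "species_A": "ATGCCGTTAGCTAGGCTTAA",
    "species_B": "ATGACGTTCGCAAGGTTTGA",
}

# Invented query: identical to species_A except at the final position.
QUERY = "ATGCCGTTAGCTAGGCTTAT"

if __name__ == "__main__":
    print(identify(QUERY, REFERENCES))
```

The threshold matters in a forensic setting: a match that is merely the best available is weaker evidence than one that also clears a pre-agreed similarity cut-off.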

The Pisa team has helped to set up a molecular-genetics lab at the 
University of Cyprus to do the DNA analyses. It has also trained two 
game-service officers, so that they can testify about the forensic DNA 
evidence in court. So far, the method has been used in two cases that 
are pending court, says Panicos Panayides, an officer at the game 
service in Nicosia. “We have a new card to be played to stop, or to 
try to reduce, the rate of illegal trade and consumption of birds in 
Cyprus,” says Barbanera. S.B.


When Savvas, Roger and I stopped at a well-known trapping hotspot, the
evidence was everywhere: metal poles were concreted into bases made of 
empty tyres; black irrigation pipes criss-crossed the dusty earth; old car- 
pets covered the ground to stop vegetation growing where the nets hang. 

Earlier in the trip, we found an MP3 player high in an acacia tree 
broadcasting a repetitive birdsong — a ‘tape lure’ used to attract the birds. 
Nearby, a red-backed shrike (Lanius collurio) and a sparrow both flailed 
frantically, their feet and wing tips glued onto limesticks balanced high 
in the tree. 


DEATH TOLL 

Conservationists first started systematically monitoring the extent of 
bird trapping in 2002, using a protocol developed by BirdLife Cyprus 
and the United Kingdom’s Royal Society for the Protection of Birds, 
in consultation with the Cyprus Game and Fauna Service (part of the
Ministry of Interior) and the British SBA police. The figures showed 
an initial dip in trapping around the time that Cyprus acceded to the 
EU, then an upward trend from 2007. But the 2014 trapping figures, 
published last year, caused a particular stir. 

The 2 million birds that BirdLife Cyprus estimated were captured 
during the previous trapping season was the biggest jump since moni- 
toring had begun. The report also broke down trapping trends by 
jurisdiction, and found that the SBAs accounted for much of the 
increase. It estimated that 900,000 birds were killed there — even 
though the regions take up only 3% of the land — and that there had 
been a 199% increase since 2002. By contrast, the Republic of Cyprus 
had seen a downturn in illegal trapping. (Bird killing is not thought 
to be a major problem in Northern Cyprus.) 

The record-breaking numbers prompted criticism and headlines, 
and led some conservationists and media to imply that the British 
authorities were turning a blind eye to trapping so as not to upset 
the local community. The SBA Administration told BirdLife Cyprus 
that it “does not accept the survey findings” and questioned some of 
the figures in the report. According to people at the July meeting, the 
administration was particularly concerned with how the estimates 
were reached. 

The main trapping survey is carried out over a six-week period in 
the autumn migratory season — also the main bird-hunting season. 
The surveillance team regularly visits 60 sites, each one kilometre 
square, that are deemed prime trapping territory and assigns them 
one of five categories on the basis of the scale of mist netting that 
it observes — from ‘active set net’ (where trappers have left a net 
unfurled on poles), to ‘prepared’ (where undergrowth has been 
freshly cut to produce a corridor, but no nets are present), to ‘clear’ 
(areas with no evidence of trapping). 

From these data, the team estimates how many birds are killed in 
the region and season overall. To do this, it must make assumptions, 
such as the number of birds caught in a net each day and that bird 
migration is relatively constant, when in reality it occurs in waves. 
“We always say that our estimate of numbers caught is full of assumptions,”
Papazoglou says. “It needs to be read with a lot of caveats.”
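
The kind of extrapolation described here can be illustrated with a simple calculation. The sketch below is not BirdLife Cyprus's actual protocol, which the article does not spell out. It assumes a constant catch rate per net per day (the very simplification that the group acknowledges) and scales the nets seen in surveyed squares up to a wider area. The default of 20 birds per net per day is the figure the group is reported to use; the 42-day season simply mirrors the six-week survey window and is an assumption, and the net counts and areas in the example are hypothetical.

```python
def estimate_birds_killed(active_nets_in_sample,
                          sampled_area_km2,
                          total_area_km2,
                          birds_per_net_per_day=20.0,
                          season_days=42,
                          catch_multiplier=1.0):
    """Scale nets counted in surveyed squares up to a seasonal catch estimate.

    catch_multiplier lets the reader explore how factors such as tape
    lures, which may raise the catch per net, change the result.
    """
    if sampled_area_km2 <= 0:
        raise ValueError("sampled area must be positive")
    nets_per_km2 = active_nets_in_sample / sampled_area_km2
    estimated_nets = nets_per_km2 * total_area_km2
    return estimated_nets * birds_per_net_per_day * season_days * catch_multiplier


if __name__ == "__main__":
    # Hypothetical inputs: 30 active nets seen across 60 surveyed squares
    # of 1 km2 each, extrapolated to 400 km2 of similar habitat.
    baseline = estimate_birds_killed(30, 60.0, 400.0)
    doubled = estimate_birds_killed(30, 60.0, 400.0, catch_multiplier=2.0)
    print(baseline, doubled)
```

Because every factor enters as a straight multiplication, an error in any one assumption carries through proportionally to the final figure, which is why the estimate needs to be read with caveats.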

One of the major contentions of the SBAs is over the ‘prepared’ 
category, Johnston says — because deciding whether an area is about 
to be used for trapping is to some extent subjective. And others have 
expressed concerns about the accuracy of the estimates. “We have 
some doubts over the specifics of the monitoring and the exact num- 
bers,” says Panicos Panayides, an officer at the Cyprus Game and 
Fauna Service in Nicosia. 

The July workshop was convened to address these methodology 
issues. BirdLife Cyprus invited Johnston, her colleague Nick Moran, 
who runs a major British bird survey, other bird-monitoring experts 
and representatives of the SBAs. After the workshop, Johnston and 
Moran advised BirdLife Cyprus to do away with the ‘prepared’ cate- 
gory and to increase the number of squares sampled within the SBAs, 
among other recommendations. BirdLife Cyprus will adopt these in 
its 2016 analysis, which should be published this spring. In a state- 
ment to Nature, the SBA Administration said that it did not wish to 
comment on the methodology used previously, that “all groups are 
working together to refine the recommendations produced by the 
BTO”, and that “it is imperative that we continue to work together to 
counter the practice [of bird trapping]”. 

But even with some adjustments, the trapping figures still jumped 
between 2013 and 2014, says Johnston. “The [new] equation slightly 
reduces the estimated number, but not by much, by about 10%.” And 
the year-on-year trend towards increased trapping is sound, she says, 
because the monitoring methods have been consistent over time. 
If anything, she thinks that BirdLife Cyprus’s estimated numbers 
are “conservative”. On the basis of previous studies, the group esti- 
mates that about 20 birds are captured in each net per day. But this 
figure could be much higher if, as is common today, trappers use 
taped songs to attract birds. One study estimated that such lures
can increase the number of birds flying into traps by up to 13-fold. 

Elsewhere in the world, hunting is thought to be playing a part 
in the demise of even common bird species. Last year, researchers 
warned that a highly abundant Eurasian bird, the yellow-breasted 
bunting (Emberiza aureola), had lost as much as 95% of its population 
in the past three decades or so and was close to extinction in parts of 
its range. One major driver is thought to be the trapping of birds in 
China, where they are served as an expensive delicacy.

Accurately measuring the extent of bird killing is important if 
researchers and conservationists are to gauge the damage being 
done to bird populations, and to encourage efforts to clamp down. 
But Johnston says that getting rock-solid data is extremely dif- 
ficult — particularly when visiting the monitoring sites is fraught 
with danger. “If the trappers had to fill in a form and say how many 
birds they caught on different days — we could do a great analysis,” 
she says. 

Runge says that low-level hunting of common species may not have 
a huge impact on populations. “For other endangered species, where 
only a few individuals are left, it can be really critical.” And whatever 
the precise numbers, all the agencies involved agree that bird killing 
in Cyprus needs to be tackled. The question is, how. 


POLITICAL SENSITIVITIES 

Jim Guy, divisional commander of the eastern SBA police, is polite, 
charming and hard as nails. I met him at the police station in Dhek- 
elia, a cluster of low-lying buildings behind a wire fence set off the 
road, a few kilometres from the city of Larnaka. He’d originally come
from Glasgow on a 3-year posting, but has ended up staying for 17. 

“As far as the bases themselves are concerned, there’s no denying 
it’s one of the main trapping areas,” he says. But Guy seems aggrieved 
about the criticism aimed at the SBAs since BirdLife Cyprus’s report, 
and says that lax enforcement is not to blame. Rather, he says, the 
eastern SBA — and especially the promontory of Cape Pyla — is 
a target for trappers because it is a key stopping point in the flight 
path of migratory birds. “Cape Pyla in particular has no buildings or 
houses or anything to deter, or put off birds, so it’s an ideal situation.” 

Guy says that his team takes a three-pronged approach to tackle 
trappers: prevention, education and enforcement. “To some extent, 
enforcement is an Elastoplast,” he says. It might catch some trappers,
but the practice will continue as long as there is demand for
high-priced ambelopoulia from diners and the restaurants that serve 
it — and these lie almost entirely in the Republic. 

Stopping that demand is extremely difficult, Guy adds. “The illegal 
practice in some cases is overtly or very often tacitly supported by 
people in very high political and administrative positions.” What’s 
more, officers trying to tackle trapping can find themselves threatened 
or worse. “In the UK, you can go home at night and you don’t have 
to think about your home or your family being attacked,” says Guy, 
who has had officers seriously assaulted while dealing with trappers. 

His sense of frustration is shared by Panayides. The walls of his 
office are lined with pictures of birds, and an EU Birds Directive poster 
perches above the table. Panayides says that there have been at least 30 
cases in the past decade in which game-service officers responsible for 
wildlife enforcement in the republic were harassed by trappers. “We've 
had people put bombs in the private cars of game wardens, and cases 
where the houses of game wardens have been burnt down,” he says. 

Even when trappers are caught, Panayides says, the weak punish- 
ments imposed by courts are not effective deterrents. Technically, 
Cypriot law allows a first-time trapper to be jailed for up to 3 years, 
or fined up to €17,000. In reality, most get off with a fine of a few 
hundred euros. Panayides tells of one poacher whom his team has 
caught and prosecuted eight times over the past decade. “What else 
can we do as a department?” he says disconsolately. 

The fight escalated last year. In May, a previously agreed plan
to deal with bird killing was passing through Cyprus’s Council of
Ministers when the government added a last-minute clause that
would allow selective hunting of blackcaps for ambelopoulia. The
move caused an outcry in environmental organizations, because any
method used to capture blackcaps would inevitably catch other species
and is in breach of the Birds Directive. In August, the altered plan
was rejected by the European Commission in a letter to the Cyprus
government, and observers are now waiting to see how the government
will respond.

Blackcaps (top) and European bee-eaters are both trapped in nets in Cyprus.

Meanwhile, authorities in both the republic and the SBAs are 
stepping up efforts to curb bird killing. The republic authorities are 
looking at the use of a genetic technique known as DNA barcoding to 
identify the birds served up at restaurants (see ‘DNA identifies baked 
birds’), and the SBA Administration says that it removed 11 football- 
pitches’ worth of planted acacia from the central poaching area of 
Cape Pyla last summer. The removal met with demonstrations, and 
people sat in the dirt tracks to stop the clearance contractors. In the 
area where we encountered the knife-bearing poacher, the monitoring
team now enters only if it has a police escort. The conservationists
and the poachers have reached “the top-end of the fight”, says Savvas,
who has been monitoring trapping on the island for nearly five years. 

Surveillance and enforcement will only go so far: most parties 
agree that the only real way to tackle bird killing is through educa- 
tion and social change. “The general public has to recognize that this 
is not correct,” says Panayides. “Not just legally, but also morally and 
socially.” Papazoglou, too, is realistic about what needs to be done. 
“If we don’t get the minds and hearts of people to change — we will
never change it,” she says.


Shaoni Bhattacharya is a science writer in London. 
There are at least six things in this picture that a quality-assurance manager would try to improve. Can you spot them?


QUALITY TIME 


IT MAY NOT BE SEXY, BUT QUALITY ASSURANCE 
IS BECOMING A CRUCIAL PART OF LAB LIFE. 


BY MONYA BAKER 


Rebecca Davies remembers a time when quality assurance terrified her. In 2007,
she had been asked to lead accreditation efforts at the University of Minnesota’s 
Veterinary Diagnostic Laboratory in Saint Paul. The lab needed to ensure that the 
tens of thousands of tests it conducts to monitor disease in pets, poultry, livestock 
and wildlife were watertight. “It was a huge task. I felt sick to my stomach,” recalls
Davies, an endocrinologist at the university’s College of Veterinary Medicine. 
She nevertheless accepted the challenge, and soon found herself hooked on finding — 
and fixing — problems in the research process. She and her team tracked recurring tissue- 
contamination issues to how containers were being filled and stored; they traced an assay’s 
erratic performance to whether technicians let an enzyme warm to room temperature; and 
they established systems to eliminate spotty data collection, malfunctioning equipment and 
neglected controls. Her efforts were crucial to keeping the diagnostic lab in business, but they 
also forced her to realize how much researchers’ work could improve. “That is the beauty of
quality assurance,” Davies says. “That is what we were missing out on as scientists.”

Davies wanted to spread the word. In 2009, she got permission and
financial support to launch an internal consulting group for the col- 
lege, to help labs with the dry but essential work of quality assurance 
(QA). The group, called Quality Central, now supports more than half 
a dozen research labs — helping them to design systems to ensure that 
their equipment, materials and data are up to scratch, and helping them 
to improve. 

She is also part of a small but growing group of professionals around 
the world who hope to transform basic biomedical research. Many were 
hired by their universities to help labs to meet certain regulatory stand- 
ards, but these QA consultants have a broader vision. They are not push- 
ing for universal adoption of formal regulatory certifications. Instead, 
they advocate ‘voluntary QA’. With the right strategies, they argue, 
scientists can strengthen their research and improve reproducibility. 

When Davies first started proselytizing to her fellow faculty 
members, the responses were not encouraging. “None of them found 
the idea compelling at all,” Davies recalls. How important could QA be,
they asked, if the US National Institutes of Health did not require it? 
How could anyone afford to spend money or time on non-essentials? 
Shouldn't they focus on the discoveries lurking in their data, and not 
the systems for collecting them? 

But some saw the potential, based on their own experiences. Before 
she had heard of Quality Central, University of Minnesota virologist 
Montserrat Torremorell was grateful when a colleague let her use his 
instruments to track transmissible disease in swine. But the results 
made no sense. Samples from pigs experimentally infected with influ- 
enza showed extremely low levels of the virus. It turned out that her 
benefactor had, like many scientists, skimped on equipment maintenance
to save money. “It was a real eye-opener,” Torremorell recalls. “It
just made me think that I could not rely on other people’s equipment.”


QUALITY FOR ALL 

Quality systems are an integral part of most commercial goods and 
services, used in manufacturing everything from planes to paint. Some 
labs that focus on clinical applications implement certified QA sys- 
tems such as Good Clinical Practice, Good Manufacturing Practice 
and Good Laboratory Practice for data submitted to regulatory bodies. 
There have also been efforts to guide research practices outside these 
schemes. In 2001, the World Health Organization published guide- 
lines for QA in basic research. And in 2006, the British Association of 
Research Quality Assurance (now simply the RQA) in Ipswich issued 
guidelines for basic biomedical research. But few academic researchers 
know that these standards exist (Davies certainly didn’t back in 2007). 

Instead, QA tends to be ad hoc in academic settings. Many scientists 
are taught how to keep lab notebooks by their mentors, supplemented 
perhaps by a perfunctory training course. Investigators often impro- 
vise ways to safeguard data, maintain equipment or catalogue and care 
for experimental materials. Too often, data quality is as likely to be 
assumed as assured. 

Scientific rigour has taken a drubbing in the past few years, with 
reports that fewer than one-third of biomedical papers can be repro- 
duced (see Nature http://doi.org/477; 2015). Scientific culture, training 
and incentives have all been blamed for promoting sloppy work; a com- 
mon refrain is that the status quo values publication counts over careful 
experimentation and documentation. “There is chaos in academia,”
says Masha Fridkis-Hareli, head of ATR, a biotechnology consultancy 
in Worcester, Massachusetts, that also conducts laboratory work to 
help move basic research into industry. For every careful researcher 
she has encountered, there have been others who have thought noth- 
ing of scribbling data on paper towels, repeating experiments without 
running controls and guessing at details months after an experiment.
Davies insists that plenty of scientists are doing robust work, but there
is always room for improvement (see ‘Solutions’). “There are easy fixes to
situations that shouldn’t be happening, but are,” she says.

NATURE.COM
How does your lab’s QA measure up?
go.nature.com/e6xupg

Michael Murtaugh, a swine biologist at the University of Minnesota,
had tried to establish practices to beef up the reliability of his team’s 
lab notebooks, but the attempts that he made on his own never gained 
traction. Then Davies got on his case. After a year or so of her “planting 
seeds” — as she puts it — Murtaugh agreed to work with Quality Central 
and implement a low-tech but effective solution. 

On designated Mondays, each member of Murtaugh’s lab draws a 
name from a paper bag to determine whose notebook to audit. The 
scientists check that their assigned books include relevant controls for 
experiments, and indicate where data are stored and which particular 
machine generated them. The group also makes sure that any problems
noted in the previous check have been addressed. It takes about [...]
every few weeks, but that’s enough to change people’s habits. Graduate
student Michael Rahe says that the checks ensure that he keeps his
notebook legible and up to date. “I never used to put in raw data,” he
says.

Albert Cirera, a technologist developing gas nanosensors at the
University of Barcelona in Spain, has also embraced QA. As his lab group
grew to 12 people, he found it difficult to monitor everyone’s
experiments, and his own efforts to implement a tracking system were
inadequate. He turned to a university-based QA consulting service for
help. Now, samples, equipment and their data are all linked with tracking
numbers printed on stickers and recorded in individuals’ notebooks, on
samples and in a central tracking file. The system does not slow down
experiments, and staying abreast of projects is a breeze, says Cirera.
But getting to this point took about four months and frequent
consultations. “It was not something that you can create from zero,” he
says.


MAKING A MARKET 

Any scientist adopting a QA system has to wager that the up-front 
hassle will pay off in the future. “It is very difficult to get people to 
check and annotate everything, because they think it is nonsense,” says 
Carmen Navarro-Aragay, head of the University of Barcelona quality 
team that worked with Cirera. “They realize the value only when they 
get results that they do not understand and find that the answer is
lurking somewhere in their notebooks.”

Even when experiments go as expected, quality systems can save 
time, says Murtaugh. Methods and data sections in papers practically 
write themselves, with no time wasted in frenzied hunting for missing 
information. There are fewer questions about how experiments were 
done and where data are stored, says Murtaugh. “It allows us to con- 
centrate on biological explanations for results.” 

The more difficult data are to collect, the more important a good 
QA system becomes. Catherine Bens, a QA manager at Colorado State 
University in Fort Collins, says that she remembers getting cold, wet 
and dirty when she had to monitor a study involving ultrasound scans 
and blood samples from a population of feral horses in North Dakota. 
Typical animal-identification practices such as ear tagging were not 
allowed. So, before the collection started, Bens supported researchers 
as they rehearsed procedures, pre-labelled tubes, made back-up labels 
and recruited animal photographers and park volunteers to ensure 
that samples would be linked to the correct animals. Even in a snow 
storm with winds so loud that everyone had to shout, the team made 
sure that each data point could be traced. 

Rare samples or not, few basic researchers are clamouring to get QA 
systems in place. Most are unfamiliar with the discipline, says Davies. 
Others are hostile. “They see it as trying to constrain them, and that 
you’re making them do more work.”

Before awarding certain grants, the Found Animals Foundation in 
Los Angeles, California, which funds research on animal sterilization,
requires proof that instruments have been calibrated and that written 
plans exist for tracing data and dealing with outliers. It can be a
struggle, says Shirley Johnston, scientific director of the foundation. One
grant recipient argued that QA systems were unnecessary because just 
looking over the data would reveal their quality. 

Part of the resistance may be down to how some QA professionals 
present themselves. “A lot of them are there to tell you what you are 
doing is wrong, and a lot of them are not very nice about it,” says Terry
Nett, a reproductive biologist at Colorado State University who expe- 
rienced this first-hand when he worked with outside consultants to 
incorporate Good Laboratory Practice principles in his lab. The effort 
was frustrating. “Instead of helping us understand, they would act 
like a dictator,” Nett recalls. “I just didn’t want them in my lab.” A few
years ago, however, the university hired its own quality managers, and 
things changed. The current manager, Bens, acts more like a partner, 
Nett says. She points out where labs are already using robust practices, 
and explains the reasoning behind QA practices that she introduces. 

To win scientists over, Bens stresses that QA systems produce data 
that can withstand criticism. “You build a support system around 
any data point you collect,” she says. When there is a strange result, 
researchers have documentation to trace its provenance. That can 
show whether a data point is real, an outlier or a problem: for example,
if a blood sample was not kept cold or was stored in the wrong tube.

Scientists need to take the lead on which QA elements they incor- 
porate, says Melissa Eitzen, director of regulatory operations at the 
University of Texas Medical Branch in Galveston. “You want to give 
them tips that they can take or not take,” she says. “If they choose it, 
they'll do it. If you tell them they have to do it, that’s a struggle.” 

Rapport is paramount, says Michael Jamieson at the University of 
Southern California in Los Angeles, who helps other faculty mem- 
bers to move research towards clinical applications. Instead of talking 
about quality systems, he prefers to discuss concrete behaviours, such 
as labelling bottles with expiry dates and storage conditions. QA jar- 
gon puts scientists off, he says. “Using the term good research practice 
makes most researchers want to run the other way.” 

It’s a lesson that many QA specialists have taken to heart. Some say
‘assessment’ or ‘quality improvement’ instead of ‘audit’. Even ‘research
integrity’ can be an inflammatory phrase, says Davies. “You have to
find a way to communicate that QA is not punitive or guilt-inspiring.”


NOT INTO TEMPTATION 
Having data that are traceable (down to who did what experiment on
which machine, and where the source data are stored) has knock-on
benefits for research integrity, says Nett. “You can’t pick out the
data that you want.” Researchers who must provide strong explanations
about why they chose to leave any information out of their analysis
will be less tempted to cherry-pick data. QA can also weed out digital
meddling: popular spreadsheet programs such as Microsoft Excel can be
vulnerable to errors or manipulation if not properly locked, but QA
teams can set up instruments to store read-only files and prevent
researchers from tampering with data accidentally or intentionally.
“I can’t help but think that QA is going to make fraud harder,” says
Davies.


There are many things wrong with the fictitious lab shown on page 456.
But here are six that a quality-assurance manager would identify, and
how they would solve them.

DISORGANIZED SAMPLE STORAGE

Clear labelling and proper organization are important for
incubators and freezers. Everyone in the lab should be
able to identify a sample, where it came from, who did
what to it, how old it is and how it should be stored.

INADEQUATE DATA LOGGING

Data should be logged in a lab notebook, not scribbled
onto memo paper or other detritus and carelessly
transcribed. Notebooks should be bound or digital;
loose paper can too easily be lost or removed.

VARIABLE EXPERIMENTS

Protocols should be followed to the letter or
deviations documented. If reagents need to be kept
on ice while in use, each lab member must comply.

UNSECURED DATA ANALYSIS

Each lab member should have their own password for
accessing and working with data, to make it clear who
works on what, when. Some popular spreadsheet
programs can be locked down so that manipulating
data, even accidentally, is difficult.

MISSED MAINTENANCE

Instruments should be calibrated and maintained
according to a regular, documented schedule.

OLD AND UNDATED REAGENTS

These can affect experimental results. Scientists
should specify criteria for age and storage of all
important reagents.


And good quality systems can be contagious. Melanie Graham, who
studies diabetes at the University of Minnesota, often collaborates
with others to test potential treatments. More than once, she says,
collaborators have sent her samples in a polystyrene tube with nothing
but a single letter written on it. Graham sends it back and requests a
label that specifies the sample’s identity and provenance, and a range
of storage temperatures. ‘Keep frozen’ is too vague: she will not risk
performing uninformative experiments because reagents stored in a
standard freezer were supposed to be kept at −80 °C.

When she first sent documentation requirements to collaborators, 
she expected them to push back. Instead, reactions were overwhelm- 
ingly positive. “It’s a relief for them,” says Graham. “They want us to 
handle their test article in a trusted way.” 

The benefits go beyond providing solid data. In 2013, Davies worked 
with Torremorell and other Minnesota faculty members on a proposal 
to monitor and calibrate equipment used by several labs. The plan that 
they put in place helped them to secure US$1.8 million to build shared 
lab space to deal with animal pathogens, says Torremorell. “If we want 
to be competitive to get funding, and if we want people to believe our 
data, we need to be serious about the data that we generate.” 

Davies is still trying to spread the word. Her invitations to give talks 
and review grant applications have mushroomed. She and collaborators 
at other institutions have been developing online training materials 
and offering classes to technicians, postdocs, graduate students and 
principal investigators. After a presentation last year, a member of the 
audience told her that he had reviewed a grant from one of her clients; 
the QA plan had made the application stand out in a positive way. 
Davies was delighted. “I could finally come back to my folks and say, 
‘It was noticed.’”

Davies knows it is still an uphill battle, but her ultimate goal is to 
make QA as much a part of research as peer review. It may not have 
the flash and dazzle of other efforts to ensure that research is robust 
and reproducible, but that is not the point. “A QA programme isn’t
sexy,” says Michael Conzemius, a veterinary researcher at the University
of Minnesota and another client of Quality Central. “It’s just kind
of become the nuts and bolts of the scientific process for us.”


Monya Baker writes for Nature from San Francisco, California. 


Don’t let transparency
damage science 


Stephan Lewandowsky and Dorothy Bishop explain how the research 
community should protect its members from harassment, while 
encouraging the openness that has become essential to science.


Transparency has hit the headlines. In the wake of evidence that
many research findings are not reproducible, the scientific community
has launched initiatives to increase data sharing, transparency and
open critique. As with any new development, there are unintended
consequences. Many measures that can improve science (shared data,
post-publication peer review and public engagement on social media)
can be turned against scientists. Endless information requests,
complaints to researchers’ universities, online harassment, distortion
of scientific findings and even threats of violence: these were all
recurring experiences shared by researchers from a broad range of
disciplines at a Royal Society-sponsored meeting last year that we
organized to explore this topic. Orchestrated and well-funded
harassment campaigns against researchers working in climate change
and tobacco control are well documented. Some hard-line opponents to
other research, such as that on nuclear fallout, vaccination, chronic
fatigue syndrome or genetically modified organisms, although less
resourced, have employed identical strategies.

Such attacks place scientists in a
difficult position. Good researchers do not 
turn away when confronted by alterna- 
tive views. However, their openness can be 
exploited by opponents who are keen to stall 
inconvenient research. When people object 
to science because it challenges their beliefs 
or jeopardizes their interests, they are rarely 
committed to informed debate. 

The progress of research demands trans- 
parency. But as scientists work to boost 
rigour, they risk making science more vul- 
nerable to attacks. Awareness of tactics is 
paramount. Here, we describe ways to dis- 
tinguish scrutiny from harassment. 


USE AND ABUSE 

We have identified ten red-flag areas that can 
help to differentiate healthy debate, prob- 
lematic research practices and campaigns 
that masquerade as scientific inquiry (see 
‘Ten red flags’). None by itself is conclusive,
but a preponderance of troubling signs can 
help to steer the responses of scientists and 
their institutions to criticism. 

We also examine five legitimate tools of 
scholarly exchange, how they can be ‘weaponized’
(see ‘Five double-edged tools’ for
a summary) and how to protect openness 
while curtailing its abuse. 


Calls for open data: checking versus under- 
mining. Many organized attacks call for more 
data, often with the aim of finding an analysis
method that makes undesirable results go
away. The tobacco industry sponsored and
drafted US legislation to enhance access to
data on tobacco research, with the intention
to delay or prevent evidence-based public-health
measures. Calls for more data can also
be used to create the false impression that data 
are being withheld. In October last year, the 
chair of the Committee on Science, Space, and 
Technology of the US House of Representa- 
tives, a long-term critic of climate scientists, 
subpoenaed data from a federal agency that 
were already publicly available on the Internet 
(see go.nature.com/p4tmjd). 

Protective action. We strongly support 
open data, and scientists should not regard
all requests for data as harassment. 

When researchers cannot share data, 
they should explain why. Valid reasons may 
include confidentiality issues with clini- 
cal data, and cases in which participants’ 
consent did not explicitly encompass data 
sharing. Researchers also need control over 
how data is to be used if it goes beyond what
participants agreed to (for example, analysis 
of ethnic, race or gender differences in data 
collected for different purposes). The status 
of data availability should be enshrined in the 
publication record along with details about 
what information has been withheld and why. 
Some journals and publishers are already moving towards this practice
(for example, PLOS and some journals published by the Association for
Psychological Science, including Psychological Science). Calls for a
data set that ignore its open availability (including limitations
agreed on during publication, where applicable) could suggest
harassment.

We suspect that explicit discussion of what data are and are not
available as part of the original publication process might have
averted some of the ongoing controversy surrounding the PACE clinical
trial, a UK study on chronic fatigue syndrome. The issue involves
requests for data by transparency advocates, and the refusal by
researchers and institutions to release data citing patient
confidentiality, limited consent and requestors’ intent.

Even when data availability is described in papers, tension may still
arise if researchers do not trust the good faith of those requesting
data, and if they suspect that requestors will cherry-pick data to
discredit reasonable conclusions. Research is already moving towards
study ‘pre-registration’ (researchers publishing their intended method
and analysis plans before starting) as a way to avoid bias, and the
same strictures should apply to critics during reanalysis. In general,
critics and original researchers should obey symmetrical standards of
openness and responsibility and be subject to symmetrical scrutiny
concerning conflicts of interest. In cases in which researchers have no
confidence in the good faith of the people requesting data, one
potential solution would be arbitration by an independent adjudicator.


TEN RED FLAGS

Dr A publishes a study showing that food X increases the risk of
disease Y. Critics accuse her of incompetence, scaremongering and
ethical violations. Do these accusations constitute harassment or
healthy debate?

Expertise
  Raises red flags about researcher: Does Dr A’s contested work fall
  outside her training or her previous publications?
  Raises red flags about critics: Are the critics operating outside
  their area of apparent expertise? Do the critics refuse to engage
  with the peer-reviewed literature?

Conflicts
  Raises red flags about researcher: Is Dr A funded by competitors of
  X? Is she marketing an antidote for Y?
  Raises red flags about critics: Do the critics have a financial
  interest in the results?

Communication
  Raises red flags about researcher: Did Dr A promote this work without
  publishing it in a peer-reviewed journal?
  Raises red flags about critics: Do the critics attack all researchers
  who show that X is harmful?

Errors
  Raises red flags about researcher: Does Dr A have a track record of
  major errors? Has she been defensive about minor errors?
  Raises red flags about critics: Do the critics use small errors to
  dismiss all of Dr A’s work?

Balance
  Raises red flags about researcher: Does Dr A have a record of
  misrepresenting evidence? Does she dismiss counter-arguments?
  Raises red flags about critics: Do the critics have a record of
  cherry-picking evidence in public statements?

Scholarship
  Raises red flags about researcher: Are results out of line with
  existing, reputable scholarship, if it exists?
  Raises red flags about critics: Can the critics specify what they
  would regard as convincing evidence?

Transparency
  Raises red flags about researcher: Has Dr A refused to make data
  available? Has she ignored reasonable disclosure standards?
  Raises red flags about critics: Are the critics making showy demands
  for already-public data, or for data for which patients have not
  consented to publication?

Track record
  Raises red flags about researcher: Has Dr A routinely promoted flashy
  work without peer review?
  Raises red flags about critics: Do the critics attack scientists
  across disciplines on different topics? Do they have a track record
  of harassment or vexatious complaints?

Insults or libel
  Raises red flags about researcher: Does Dr A uniformly dismiss
  critics as ignorant, biased or conflicted?
  Raises red flags about critics: Are the critics levelling personal
  attacks? Are criticisms from anonymous sources or ‘sock puppets’?

Freedom-of-information requests
  Raises red flags about researcher: Does Dr A claim that funding
  sources are irrelevant? Has she erected barricades to disclosure?
  Raises red flags about critics: Do the critics use
  freedom-of-information requests for private correspondence unrelated
  to funding?


460 | NATURE | VOL 529 | 28 JANUARY 2016

© 2016 Macmillan Publishers Limited. All rights reserved


Social media: rapid correction versus mob 
rule. Blogs and social media enable rapid cor- 
rection of science by scientists, as shown by 
the ‘arsenic life’ controversy in 2012, in which 
initial claims of a startling finding — that a 
bacterium could survive without phosphorus 
by substituting arsenic in its place in essen- 
tial biomolecules — were rapidly rebutted 
by experts online (see Nature http://doi.org/fx24wg;
2012). Yet social media and online
comments also offer an easy way to inject 
biased, incorrect or misleading information. 
And because engagement with critics is a core 
element of scientific practice, researchers may 
feel obliged to respond even to ‘trolls’ (online
harassers). 

Protective action. Scientists should ignore 
critics who are abusive or illogical and those 
that make the same points repeatedly despite 
rebuttals. Internet trolling has been associated
with sadism and psychopathy. Engagement
with such bad-faith actors can imperil
scientists’ well-being in a way that university 
ethics committees would never condone in 
research on human subjects. 


STEVE HELBER/AP/PA 


All who participate in post-publication 
review should identify themselves. The 
drawbacks of anonymity (its encourage- 
ment of bad behaviour) outweigh its advan- 
tages (for example, it allows junior people 
to criticize senior academics without fear of 
redress). What’s more, the scientific community
should not indulge in games of ‘gotcha’
(intentionally turning small errors against 
a person). Minor corrections and clarifica- 
tions after publication should not be a reason 
to stigmatize fellow researchers. Scientific 
publications should be seen as ‘living documents’,
with corrigenda an accepted, if
unwelcome, part of scientific progress.


Freedom-of-information requests: right 
to know versus right to privacy. Freedom- 
of-information (FOI) requests have revealed 
conflicts of interest, including undisclosed 
funding of scientists by corporate interests 
such as pharmaceutical companies and utili- 
ties. But information requests have also been 
used as harassment, in attempts to embar- 
rass researchers or just to waste their time. In 
2010, the then-attorney-general of Virginia 
sought to obtain private e-mail correspond- 
ence from climate scientist Michael Mann, 
relating to work undertaken while he was at 
the University of Virginia in Charlottesville. 
This request, widely seen as a witch-hunt (see 
Nature 465, 135-136; 2010), was ultimately 
struck down by the Virginia high court. 
Protective action. Given that contempo- 
rary conversations are mainly conducted 
by e-mail, broad-ranging FOI laws risk 
being tantamount to permanent wiretaps 
in academics’ offices. We fear that with- 
out the guarantee of privacy during e-mail 
conversations, self-censorship will have 
chilling effects on academic freedom and 
incisive discussion. A 2013 decision of the 
UK information commissioner towards 
preserving researchers’ rights against dis- 
closing “material which is still in the course 
of completion, to unfinished documents or 
to incomplete data” is encouraging, and
cogent guidelines are beginning to emerge. 
However, the right to privacy should not extend to funding
arrangements. Researchers should scrupulously disclose all sources of
funding; even small undisclosed amounts can create an impression of
undue influence, as in a 2015 case involving a US researcher who was
working on genetically modified crops and had received US$25,000 from
Monsanto to assist his outreach efforts (see Nature 524, 145-146;
2015). FOI requests can be an appropriate tool in cases involving the
conflation of public money and private interests.

Climate scientist Michael Mann was hounded for private e-mails.


Calls for retraction: correction versus censorship. Publication
retractions have historically been reserved for cases of fraud or
grave errors. Increasingly, however, calls for retraction are coming
from people who do not like a paper’s conclusions. In one famous case,
a committee created by the National Football League called for a
journal to retract an article by a medical researcher who argued that
severe brain damage in a deceased American-football player had
probably resulted from repeated concussions. (These conclusions were
eventually endorsed by independent researchers.)

Protective action. Journals and professional societies should condemn
specious calls for retraction. Journals and institutions can also
publish threats of litigation, and use sunlight as a disinfectant.


FIVE DOUBLE-EDGED TOOLS

Legitimate tools of scholarly exchange can be weaponized.

Call for data
  Use: Permit the replication or inspection of analyses.
  Abuse: Impugn scientists’ integrity (when data is already
  available); biased re-analyses.

Social-media posts
  Use: Highlight errors or questionable practices.
  Abuse: Stalk, libel, intimidate or harass.

Freedom-of-information requests
  Use: Reveal hidden conflicts of interest.
  Abuse: Launch a fishing expedition into private correspondence.

Call for retraction
  Use: Remove unethical or erroneous work from the literature.
  Abuse: Discredit inconvenient results.

Complaints to universities
  Use: Redress unethical conduct.
  Abuse: Damage reputation.


Institutional self-scrutiny versus
protection from harassment. Universi- 
ties have complaint processes for good 
reasons. However, complaints are also used 
to undermine researchers doing legitimate 
but controversial science.

Protective action. Scientists who are 
harassed often feel alone. Universities do not 
tolerate harassment based on race or gender, 
and neither should they tolerate harassment 
based on contentious science. They should 
provide training and support to help their 
researchers cope. Public declarations can be 
particularly useful: in 2014, in response to 
the harassment of one of its professors, the 
Rochester Institute of Technology in New 
York publicly acknowledged the scientific 
consensus on climate change and its support 
for academic freedom. 


NEXT STEPS 

Numerous professional bodies, educational 
institutions, government agencies and 
journals have convened meetings during 
the past few years to put science under the 
microscope. Issues such as reproducibility 
and conflicts of interest have legitimately 
attracted much scrutiny and have stimu- 
lated corrective action. As a result, the field is 
being invigorated by initiatives such as study 
pre-registration and open data. 

Similar attention must be devoted to 
stressors and threats to science that arise in 
response to research that is considered incon- 
venient. The same institutions and bodies 
that have scrutinized science must also start 
a conversation about how to protect it.


Stephan Lewandowsky is professor in
cognitive psychology at the University of
Bristol, UK, and focuses on the public
understanding of science. Dorothy
Bishop is professor of developmental
neuropsychology at the University of
Oxford, UK; she chaired a symposium at the
Wellcome Trust in London in April 2015 on
improving scientific reliability.

e-mails: stephan.lewandowsky@bristol.ac.uk;
dorothy.bishop@psy.ox.ac.uk

Richard Dawkins in 1976, around the time he published his first best-selling book.


The Selfish Gene 


Matt Ridley reassesses Richard Dawkins’s pivotal 
reframing of evolution, 40 years on.

Books about science tend to fall into
two categories: those that explain it
to lay people in the hope of cultivating
a wide readership, and those that try to
persuade fellow scientists to support a new 
theory, usually with equations. Books that 
achieve both — changing science and reach- 
ing the public — are rare. Charles Darwin's 
On the Origin of Species (1859) was one. The 
Selfish Gene by Richard Dawkins is another. 
From the moment of its publication 40 years 
ago, it has been a sparkling best-seller and a 
scientific game-changer. 

The gene-centred view of evolution that 
Dawkins championed and crystallized is 
now central both to evolutionary theoriz- 
ing and to lay commentaries on natural 
history such as wildlife documentaries. A 
bird or a bee risks its life and health to bring 
its offspring into the world not to help itself, 
and certainly not to help its species — the 
prevailing, lazy thinking of the 1960s, even 
among luminaries of evolution such as Julian 
Huxley and Konrad Lorenz — but (uncon- 
sciously) so that its genes go on. Genes that 
cause birds and bees to breed survive at the 
expense of other genes. No other explana- 
tion makes sense, although some insist that 
there are other ways to tell the story (see 
K. Laland et al. Nature 514, 161-164; 2014). 

What stood out was Dawkins’s radical 
insistence that the digital information in 
a gene is effectively immortal and must be 
the primary unit of selection. No other unit 
shows such persistence — not chromosomes, 
not individuals, not groups and not species. 
These are ephemeral vehicles for genes, just 
as rowing boats are vehicles for the talents of 
rowers (his analogy). 

As an example of how the book changed 
science as well as explained it, a throwaway 
remark by Dawkins led to an entirely new 
theory in genomics. In the third chapter, he 
raised the then-new conundrum of excess 
DNA. It was dawning on molecular biolo- 
gists that humans possessed 30-50 times 
more DNA than they needed for protein- 
coding genes; some species, such as lung- 
fish, had even more. About the usefulness 
of this “apparently surplus DNA”, Dawkins
wrote that “from the point of view of the
selfish genes themselves there is no paradox.
The true ‘purpose’ of DNA is to survive,
no more and no less. The simplest way to
explain the surplus DNA is to suppose that
it is a parasite.”

The Selfish Gene
RICHARD DAWKINS
Oxford University Press: 1976.

NATURE.COM: For a review of Richard Dawkins’s latest memoirs, see go.nature.com/cqukcg

Four years later, two pairs of scientists
published papers in Nature formally setting
out this theory of “selfish DNA”, and
acknowledged Dawkins as their inspiration
(L. E. Orgel and F. H. C. Crick Nature
284, 604-607 (1980);
W. F. Doolittle and C. Sapienza Nature
284, 601-603; 1980).
Since then, Dawkins’s speculation has
been borne out by the discovery that
much surplus DNA
consists of reverse
transcriptase — a 
viral enzyme whose job is to spread copies 
of itself — or simplified versions of trans- 
posons dependent on it. Thus, Dawkins’s 
ideas helped to explain what was going on 
inside genomes, as well as between individu- 
als, even though the book was written long 
before DNA sequencing became routine. 
The complexity of the structure of the gene 
itself has since grown enormously, with the 
discovery of introns, control sequences, 
RNA genes, alternative splicing and more. 
But the essential idea of a gene as a unit of 
heritable information remains, and Dawk- 
ins’s synthesis stands to this day. 

On The Selfish Gene’s 30th anniversary, 
many of Dawkins’s admirers, including 
writer Philip Pullman and cognitive scien- 
tist Steven Pinker, contributed essays to the 
book Richard Dawkins (Oxford University 
Press, 2006) edited by his former students 
Alan Grafen and Mark Ridley (no relation 
of mine). In this Festschrift, the philosopher 
Daniel Dennett argued that the book was not 
just science, but “philosophy at its best”. In 
my contribution, I pointed out that the suc- 
cess of the book had spawned a gold rush 
for popular-science writers, as publishers 
began offering large advances in the hope of 
finding the next Selfish Gene. James Gleick’s 
Chaos (Abacus, 1988), Stephen Hawking’s 
A Brief History of Time (Bantam, 1988) and 
Pinker’s The Language Instinct (William 
Morrow, 1994) were among the nuggets 
mined before the boom petered out. 

Although his book brimmed with original 
thoughts, Dawkins was quick to acknowl- 
edge that he was building on the discov- 
eries and insights of others, notably the 
evolutionary theorists William Hamilton, 
George Williams, John Maynard Smith and 
Robert Trivers. They were equally quick to 
appreciate that he had done something more 
than explain their ideas. Trivers wrote the 
foreword, and Maynard Smith narrated a 
television documentary about the book 
soon after it was published. Williams said 
in an interview that Dawkins’s book had 
“advanced things a lot further than mine 
did” (see go.nature.com/21j1mt); Hamilton 
wrote that The Selfish Gene “succeeds in the 
seemingly impossible task of using simple, 
untechnical English to present some rather 
recondite and quasi-mathematical themes 
of recent evolutionary thought” in a way 
that would “surprise and refresh even many 
research biologists” (W. D. Hamilton Science 
196, 757-759; 1977). 

Dawkins speaking at an atheist event in 2012.

As a first-year undergraduate in the zoology
department at the University of Oxford,
UK, where Dawkins was about to teach me
computing and animal behaviour, I found
the book exhilarating and bewildering. Until 
then, my teachers had helpfully divided 
the world into right ideas and wrong ones. 
But here was a writer turning some settled 
science upside down and inviting me to join 
him on a journey to discover a truth that 
seemed to him “stranger than fiction”. Was 
he right or wrong? I was being shown the 
arguments, not the answers. 

The origin of The Selfish Gene is intriguing. 
Dawkins revealed in the first volume of his 
memoirs, An Appetite for Wonder (Bantam, 
2013; see E. Scott Nature 501, 163; 2013),
that the idea of selfish genes was born
ten years before the book was published.
In 1966, the Dutch biologist Niko Tinbergen
asked Dawkins, then a research
assistant with a new
doctorate in animal behaviour, to give some 
lectures in his stead. Inspired by Hamilton, 
Dawkins wrote in his notes (reproduced in 
An Appetite for Wonder): “Genes are in a 
sense immortal. They pass through the gen- 
erations, reshuffling themselves each time 
they pass from parent to offspring ... Natu- 
ral selection will favour those genes which 
build themselves a body which is most likely 
to succeed in handing down safely to the next 
generation a large number of replicas of those 
genes ... our basic expectation on the basis of
the orthodox, neo-Darwinian theory of evolution
is that Genes will be ‘selfish’.”

Dawkins began writing the book in 1973,
and resumed it in 1975 while on sabbatical.
At the suggestion of Desmond Morris, the
zoologist and author of The Naked Ape
(Jonathan Cape, 1967), Dawkins showed
some draft chapters to Tom Maschler of
Jonathan Cape, who strongly urged that the
title be changed to “The Immortal Gene”.
Today, Dawkins regrets not taking the 
advice. It might have short-circuited the 
endless arguments, so beloved of his critics 
and so redolent of the intentional stance (in 
which we tend to impute mental abilities to 
unconscious things, from thunderstorms to 
plants), about whether selfishness need be 
conscious. It might even have avoided the 
common misconception that Dawkins was 
advocating individual selfishness. 

In the end, it was Michael Rodgers of 
Oxford University Press who enthusiastically 
published The Selfish Gene, after demand- 
ing “I must have that book!” when he saw 
early draft chapters. It was an immediate 
success, garnering more than 100 reviews, 
mostly positive. Dawkins went on to write 
books that were better in certain ways. The 
Extended Phenotype was more groundbreak- 
ing, The Blind Watchmaker more persuasive, 
Climbing Mount Improbable more logical, 
River out of Eden and Unweaving the Rainbow
more lyrical, The Ancestor’s Tale more
encyclopaedic, The God Delusion more controversial.
But they were all variations on the
themes he so eloquently and adventurously
set out in The Selfish Gene.


Matt Ridley’s latest book is The Evolution 
of Everything. He is a columnist for The 
Times. 

Twitter: @mattwridley

Pop-up diagrams in a 1570 translation of Euclid’s Elements, for which John Dee wrote the preface.


Archive of wonders 


Philip Ball browses remnants of the celebrated library 
of mathematician and occultist John Dee. 


Before Galileo, it is difficult to tell the
history of science as such. No one
exemplifies that better than John Dee
(1527-1609). The Tudor scholar was one of 
the most respected mathematicians of his 
day, advising explorers on navigation and 
fashioning ingenious mechanical devices for 
the theatre. He also cast horoscopes, stood 
accused of witchcraft, collected books of 
magic and professed to converse with angels. 
“Scholar, courtier, magician” is how he is 
described in an exhibition at the Royal Col- 
lege of Physicians (RCP) in London. 

The show explores Dee's identity through 
the medium of his legendary library. At his 
home in Mortlake, west of London, Dee 
amassed perhaps the finest English library 
of his time, on topics from astronomy and 
alchemy to Greek poetry. Ultimately, the exhi- 
bition concedes that it is impossible either to 
sum up Dee’s activities and interests under
a single rubric or to understand them at all 
within the modern boundaries of science. 

This is probably why Dee has enjoyed so 
many reincarnations in popular culture. 
Unlike the more clearly delineated Isaac 
Newton, we can refract him through the pre- 
occupations of the age. Allegedly, he was the 
archetype for the magus Prospero in William 
Shakespeare's 1610 The Tempest, and he is said 
to have advised on the Globe Theatre's design. 
By 1659, when Dee’s transcripts of angelic
conversations were posthumously published
(after being exhumed; he had buried them 
in his garden), his work informed the debate 
about the reality of angels and demons. That 
was of interest a few years later to natural phi- 
losopher Robert Boyle, who believed that the 
existence of spirits vindicated religious ideas 
and undermined atheism. Boyle's colleague 
Robert Hooke claimed — unconvincingly 
— that Dee's angelic discourse contained 
encrypted intelligence for the court. 

In 1806, these activities were adduced in a
book on insanity and mass hysteria by English
physician Thomas Arnold: supernatural
phenomena now fell into the nascent realm
of psychology, and Dee was seen as a deluded
fanatic. A marvellous late-nineteenth-century
painting by Henry Gillard Glindoni shows
Dee conjuring with chemicals and
fire before Queen Elizabeth I;
X-rays show that he originally
stood in a circle of skulls, later
painted over. In recent years,
Dee has been conscripted for
the psychogeography of novelist
Peter Ackroyd (The House of Doctor Dee;
Hamish Hamilton, 1993) and postmodern
fantasies of Albion in Damon Albarn’s 2011
opera Dr Dee.

John Dee pictured in an engraving.

Scholar, Courtier, Magician: The Lost Library of John Dee
18 January–29 July


The RCP holds the largest remaining col- 
lection of Dee's books. Unsure of the English 
queen's favour, Dee left the country in 1583 
under the patronage of disreputable Polish 
prince Olbracht Laski, entrusting his library 
to his brother-in-law, Nicholas Fromond. It 
was a bad choice: Fromond allowed Dee's 
‘friends’ and former pupils to carry off or buy 
the books. Dee returned, his fortunes hav- 
ing soured, to find most of his books gone. 
One energetic thief was Nicholas Saunder, 
possibly a former pupil: around 160 books 
that he filched ended up in the collection of 
Henry Pierrepont, the Marquis of Dorches- 
ter. They were among the 3,000 or so volumes 
bequeathed to the RCP around 1680, and sev- 
eral of them, supplemented by other books 
of Dee’s from elsewhere, are now on display. 
They open a window on Dee’s broad interests.

There are volumes on mining; Dee advised 
naval expeditions for voyages in search of 
far-flung gold to enrich Elizabeth's “British 
Empire” (a term that first appears in Dee’s 
1577 Perfect Arte of Navigation). He wrote 
the preface to a 1570 English translation of 
Euclid’s Elements, complete with assemble- 
yourself pop-up figures of polyhedra and 
intersecting planes. Here, he defended math- 
ematics against charges of witchcraft and 
explained its use in developing mechanical 
inventions and technologies. To Dee, math- 
ematics was an art both mystical and practi- 
cal: the key both to cosmic harmony and to 
something approaching quantitative science. 

He read about navigation (and doodled 
a splendid galleon in the margin of Cicero's 
Opera), made notes on the weather, devised 
horoscopes in the margins of Girolamo 
Cardano’s works on astrology. (He was 
arrested in 1555 on suspicion of illegally 
casting the horoscope of the reigning Catho- 
lic queen Mary, Elizabeth’s predecessor.) 
Also on display is a fine collection of Dee's 
instruments, including an obsidian “magical 
mirror” brought from the New World and 
the original crystal ball in which his ‘scryer’
Edward Kelley claimed to see angels. 

Does all this help us to make sense of Dee 
and where he fits (if at all) into the narrative 
of science? Rather, the exhibition shows us 
that the story has arbitrary boundaries in the 
autumn of the Renaissance, when scholars 
were unconstrained by disciplinary distinc- 
tions, and magic and marvels were 
still part of the rational cosmos.


Philip Ball is a freelance 
writer. His latest book is 
Invisible. 

e-mail: p.ball@btinternet.com

Concern grows for
Turkey’s academics 


We strongly urge the Turkish 
government to stop prosecuting 
academics, to abide by 
international human-rights 
values and to respect civil 
liberties — including freedom 
of speech (see Nature http://doi.org/bbxj;
2016).

In a petition to the government
this month, more than 
2,000 academics from Turkey 
and thousands of international 
scholars have called for an end to 
the curfews and violence against 
people in Kurdish provinces. This 
prompted Turkey’s President 
Recep Tayyip Erdogan to order 
the Higher Education Board 
to take action against those 
academics he described as 
committing “treason”. Istanbul's 
Chief Public Prosecutor launched 
a criminal investigation based on 
Article 301 of the Turkish Penal 
Code, which prosecutes those 
who insult the state. 

We are deeply concerned 
about this escalating crisis. We 
hope that the international 
academic community will join 
us in condemning these attacks 
against our colleagues in Turkey. 
Caghan Kizil* German Centre 
for Neurodegenerative Diseases, 
Helmholtz Association, Dresden, 
Germany. 
caghan.kizil@crt-dresden.de 
*Supported by 14 signatories 
(listed at go.nature.com/jy8ol6). 


Synthesize evidence 
to steer decisions 


Using evidence mapping
to display and categorize
environmental studies cannot
replace ‘evidence synthesis’
in guiding decision-making
(M. C. McKinnon et al. Nature
528, 185-187; 2015). There are
no shortcuts to evidence-based
practice.

The results of investigations
need to be synthesized to allow
conclusions to be drawn from
contradicting data (L. V. Dicks
et al. Trends Ecol. Evol. 29,
607-613; 2014). Studies can be
assigned a ‘level of evidence’
indicator of design and quality,
which is derived from evidence
hierarchies (see, for example,
A.-C. Mupepele et al. Ecol. Appl.
http://dx.doi.org/10.1890/15-0595.1;
2016). This indicator
reflects the confidence with
which the reported outcome
can be causally attributed to the
investigated driver.
Practitioners’ questions are
rarely answered directly by an
existing set of studies. Evidence-based
medicine tackles this
problem by developing clinical
guidelines on the basis of
collated scientific results and
clinical experience, and by using
systematic reviews of research
results and evidence assessments
that are supported by hierarchies.
Anne-Christine Mupepele,
Carsten F. Dormann University
of Freiburg, Germany.
anne-christine.mupepele@biom.uni-freiburg.de


Hold atmosphere 
in trust for all 


We, the undersigned, call on the 
V20 — the 20 countries that are 
most vulnerable to the effects 
of climate change — to take the 
lead in creating an ‘atmospheric 
trust’ that establishes 
community property rights 
over the atmospheric commons 
(www.claimthesky.org). The 
V20 could use this trust as a legal 
instrument to address the climate 
crisis and to help to implement 
last month's Paris agreement to 
keep warming well below 2°C. 

Under public-trust doctrine, 
certain natural resources such 
as soil and water must be held in 
trust to serve the public good. It is 
every government's responsibility 
as a trustee to protect these assets 
as natural capital and to maintain 
them for the public’s use, not give 
them away or sell them to private 
parties. The global atmosphere is 
one such asset. 

An atmospheric trust would 
act as an independent agency 
and trustee. It could collect
claims for damages to the
atmosphere and invest funds
in mitigation, adaptation and
compensation, and in resources
for the most affected populations.
Because only 90 enterprises
(mainly extractive industries)
are responsible for two-thirds
of global carbon emissions
(R. Heede Clim. Change 122,
229-241; 2014), damage claims
could target a relatively small
number of entities.

All governments would 
eventually be co-trustees in the 
atmospheric commons, with a 
fiduciary responsibility to protect 
it from catastrophic releases of 
greenhouse gases. 

Robert Costanza* The 
Australian National University, 
Acton, Australia. 
robert.costanza@anu.edu.au 

*On behalf of 31 correspondents (see 
go.nature.com/52f8mt for full list). 


What stops women 
getting more grants? 


Women make up 33% of the
applicants who are eligible for
programmes funded by the UK
Biotechnology and Biological
Sciences Research Council
(BBSRC), but they lead only
21% of grant applications. The
percentage receiving large
grants of more than £2 million
(US$2.8 million) remains
stubbornly low: in 2014,
women had a success rate of
17% compared with 44% for men.
To investigate this, we informally
surveyed focus groups from seven
BBSRC-funded universities (see
go.nature.com/wqrfz3).

All groups cited society’s 
expectations of professional 
women and (unconscious) biases 
against them. They also specified 
the way in which science as a 
profession organizes itself and 
how esteem is rewarded; and 
dominant behaviours by full- 
time researchers (primarily men) 
that seem to attract support at 
the expense of more-junior, part- 
time or flexitime researchers 
(primarily women). There 
were perceived inconsistencies
in the grant-award process,
including in the quality and
tone of reviewer comments
and committee feedback,
and concerns about gender
imbalance in the reviewer pool.
The BBSRC is working with its
research communities to address
these issues. We welcome
suggestions that could help us
to achieve a more diverse and
inclusive research community
(see also M. Urry Nature 528,
471-473; 2015).
David McAllister, Jan Juillerat,
Jackie Hunter BBSRC, Swindon,
UK.
david.mcallister@bbsrc.ac.uk


Solar energy 
needs focus 


The high cost of solar
photovoltaic installations
prevents them from providing
more than about 1% of the
world’s electricity requirement. A
solution would be to incorporate
an optical concentrator in the
solar photovoltaic module
that would save on expensive
materials without compromising
electrical output.

Optical concentrators focus 
solar energy on a small area 
attached to a photovoltaic cell 
(P. Gleckman et al. Nature 339, 
198-200; 1989). However, this 
technology has been held back by 
its complex manufacturing and 
assembly processes, its modest 
electrical-conversion efficiency 
and a lack of government 
funding and policy. 

Researchers, industries and 
governments must work together 
to resolve the technical issues 
associated with this promising 
technology and come up with a
practical, industry-ready design 
to revive the solar energy market. 
Abu Bakar Munir University of 
Malaya, Malaysia. 

Firdaus Muhammad-Sukki 
Robert Gordon University, 
Aberdeen, UK; and Multimedia 
University, Selangor, Malaysia. 
Nurul Aini Bani Universiti 
Teknologi Malaysia, Malaysia. 
fb.muhammad-sukki@rgu.ac.uk 


NEWS & VIEWS 


For News & Views online, go to 
nature.com/newsandviews 


The domestication of Cas9 


The enzyme Cas9 is used in genome editing to cut selected DNA sequences, but it also creates breaks at off-target sites. 
Protein engineering has now been used to make Cas9 enzymes that have minimal off-target effects. SEE ARTICLE P.490 


FYODOR URNOV 


For the past 30,000 years,
humans have been genetically
engineering the wolf through
selective breeding, preserving some
forms of genes and eliminating
others to produce the dog. Now,
two studies (one on page 490 of this
issue and one in Science) have used
genetic engineering to tame a differ- 
ent type of wild creature — a nuclease 
enzyme called Cas9. In doing so, they 
have markedly reduced the enzyme’s
undesirable natural tendencies, but 
have preserved its ability to cut DNA 
in an RNA-guided manner. This feat 
of molecular domestication is great 
news for practitioners of genome
editing, in which DNA sequences
in cells or organisms are changed to
scientists’ specifications efficiently
and accurately; such precise editing
requires highly targetable nucleases.

In its natural ‘wild’ state, Cas9 is
part of the bacterial immune system.
When a bacterium is infected by a parasite
such as a virus, the organism’s cellular
machinery cuts up and retains pieces of the
invader’s DNA, storing the sequences in a
region of the bacterium’s own genome called
a CRISPR locus. Cas9 then polices the bacterium
for repeat invaders by carrying with it an
RNA copy of a sequence stored in the CRISPR 
locus (for simplicity, this RNA is referred to 
here as a guide RNA, or gRNA). The enzyme 
compares intracellular DNA to the sequence in 
the gRNA, and if there is a match, Cas9 cuts the 
invading DNA. Attackers evade detection by 
changing their DNA sequence, so Cas9 evolved 
to cut incoming DNA even if its sequence is a 
less-than-perfect match to the gRNA. 
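
The tolerance for imperfect matches described above can be illustrated with a minimal sketch. It is a toy string comparison, not a model of the real enzyme: the sequences `GUIDE` and `DNA`, the function names and the `max_mismatches` parameter are all hypothetical, and the code ignores nearly all of the underlying biochemistry.

```python
from typing import List, Tuple


def count_mismatches(guide: str, site: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide, site) if g != s)


def find_cut_sites(guide: str, dna: str, max_mismatches: int) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for every window of `dna` that matches
    `guide` with at most `max_mismatches` differing letters."""
    hits = []
    for start in range(len(dna) - len(guide) + 1):
        window = dna[start:start + len(guide)]
        mismatches = count_mismatches(guide, window)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical toy sequences: one exact copy of the guide and one copy
# in which an 'invader' has changed a single letter.
GUIDE = "GATTACA"
DNA = "TTGATTACATTGATGACATT"

strict_hits = find_cut_sites(GUIDE, DNA, max_mismatches=0)
tolerant_hits = find_cut_sites(GUIDE, DNA, max_mismatches=1)
```

With no mismatches allowed, only the exact copy is found; allowing one mismatch also catches the altered copy. The same tolerance that helps a bacterium to catch a mutated invader is what produces unintended hits in a much larger genome.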

Studies of this and other bacterial defence 
mechanisms have had a major impact on 
genome editing. In this process, a nuclease cuts 
DNA inside the cell and, as this DNA break 
is being repaired, the desired edit (disruption,
correction or insertion of a gene) takes place.
The first genome-editing experiments made
use of another class of nuclease, zinc-finger
nucleases (ZFNs), but the discovery that Cas9
is led by a gRNA dramatically expanded the
scale and scope of genome-editing applications
for research purposes, because the gRNA
enables easy and relatively efficient enzyme
programming.

Figure 1 | Taming a wild enzyme. The Cas9 enzyme cuts specific
DNA sequences, which it identifies using a guide RNA (gRNA) that
pairs with the chosen sequence in an unwound DNA double helix.
Kleinstiver et al. engineered Cas9 such that interactions between the
enzyme and the backbone of the gRNA-paired DNA were weakened.
Slaymaker et al. engineered the contacts that the enzyme makes with
the complementary DNA single strand, which is not recognized by the
gRNA. These modifications forced the engineered enzymes to rely to a
greater extent on the gRNA for sequence recognition, thus improving
their binding specificity. (Adapted from ref. 1.)

Cas9 evolved to defend a bacterium that 
has a genome 1,800 times smaller than the 
human genome, and cutting DNA sequences 
that are imperfectly matched to the gRNA is an 
adaptation to its natural battlefield. As a con- 
sequence, when Cas9 was brought in from the 
wild and placed in human cells, it introduced 
genetic changes to unintended stretches of 
DNA in addition to editing the gene of interest.
Imagine a short-sighted witness to a crime
attempting to identify the perpetrator in a 
police line-up, relying not only on the facial 
features that make the criminal unique but 
also on those shared with other people, such 
as gender or height. This could lead to a case 
of mistaken identity, because a weak match to 
the witness's fuzzy mental image could be rein- 
forced by a match on shared features. Similarly, 
wild-type Cas9 finds its target not only using 
the sequence-specific gRNA, but also by grasp- 
ing onto the DNA backbone, which is the same 
in any gene.

In the present studies, Kleinstiver
et al. and Slaymaker et al. set out
to tame Cas9 using a thoughtful and
well-executed approach that relied
on an atom-by-atom understanding
of how the enzyme binds to and cuts
DNA. (A similar study of how zinc
fingers bind DNA ultimately provided
the basis for the first genome-editing
experiments.) The groups
reasoned that, by engineering Cas9 
such that its interactions with the 
DNA backbone were weakened, 
they could force the enzyme to rely 
to a greater extent on the gRNA- 
DNA pairing to recognize and cut its 
target (Fig. 1). 
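
The groups’ reasoning can be pictured with a deliberately crude scoring cartoon. It is an illustrative assumption, not the authors’ method or a physical model: a site is ‘cut’ when a sequence-specific score (one unit per matching letter) plus a sequence-independent ‘backbone’ bonus reaches a threshold. Every number and name below is hypothetical.

```python
def binding_score(matches: int, per_match: float, backbone: float) -> float:
    """Toy score: sequence-specific contribution plus a backbone
    contribution that is identical for every site."""
    return matches * per_match + backbone


def is_cut(matches: int, per_match: float, backbone: float, threshold: float) -> bool:
    """A site is cut in this cartoon when its score reaches the threshold."""
    return binding_score(matches, per_match, backbone) >= threshold


# Hypothetical parameters chosen only to make the arithmetic easy to follow.
GUIDE_LENGTH = 20
PER_MATCH = 1.0
THRESHOLD = 20.0
WILD_TYPE_BACKBONE = 4.0   # strong sequence-independent grip
ENGINEERED_BACKBONE = 1.0  # weakened grip


def tolerated_mismatches(backbone: float) -> int:
    """Largest number of mismatches that still leads to cutting (-1 if none)."""
    return max(
        (m for m in range(GUIDE_LENGTH + 1)
         if is_cut(GUIDE_LENGTH - m, PER_MATCH, backbone, THRESHOLD)),
        default=-1,
    )


wild_type_tolerance = tolerated_mismatches(WILD_TYPE_BACKBONE)
engineered_tolerance = tolerated_mismatches(ENGINEERED_BACKBONE)
```

In this cartoon, lowering the backbone bonus leaves a perfectly matched target above the threshold but pushes poorly matched sites below it, which is the qualitative effect that both groups were aiming for.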

Kleinstiver and colleagues tested 
the editing specificity of their result- 
ing enzyme, which they dubbed 
high-fidelity Cas9 (Cas9-HF), in a 
cancer-cell line. They programmed 
Cas9-HF with gRNAs for seven dif- 
ferent stretches of human DNA: in 
six cases it edited only the intended 
target, and in the remaining stretch 
the enzyme was weakly distracted by only one 
other position in the DNA. By comparison, 
wild-type Cas9 cut at multiple unintended 
sequences when tested with gRNAs for these 
seven gene targets. Crucially, for 75% of targets 
tested, Cas9-HF was just as potent a genome 
editor as its wild ancestor. Slaymaker and 
colleagues followed the same overall principle to 
produce an enzyme that they called enhanced 
specificity Cas9 (eCas9), although the details of 
their engineering and analyses differed. 

These ‘domesticated’ Cas9 enzymes are sure 
to be used by laboratories the world over. The 
immediate impact will be to shorten the time 
it takes to complete a genome-editing experi- 
ment, because the need to check for undesired 
edits will be reduced. Cas9 has been used to 
systematically scan many human genes at a 
time for those that underlie a trait of interest,
and such experiments will now be more
efficient. In agriculture — especially in species 
that have extended life cycles, such as crops 
or cattle — the use of enhanced Cas9 might 
obviate the need to perform time-consuming 
crosses to obtain a pristinely edited organism. 

Genome editing was first applied in the clinic
in 2009, when an ex vivo ZFN-based approach
was used to edit certain immune cells from
people with HIV. This approach has since
been used to treat more than 80 patients, and
has a good safety record. Last December, the
first clinical trial of in vivo gene editing, a
ZFN-based approach for treating haemophilia,
passed review by the US Food and Drug
Administration. ZFNs have already been engineered
to a level of specificity that is comparable
to that of Cas9-HF (see go.nature.com/mkl6v1),
and they have passed regulatory hurdles
for use in clinical trials in both in vivo and
ex vivo applications. The current studies inspire
confidence that the scope of clinical genome
editing will continue to expand. Advances
in this field offer the promise of engineering
genetic cures for many diseases — a prospect 
that is both encouraging and within our reach.


Fyodor Urnov is at Sangamo BioSciences, 
Richmond, California 94804, USA. 
e-mail: furnov@sangamo.com

Fluorescent boost
for voltage sensors 


The development of a voltage sensor in which a microbial rhodopsin protein is 
fused with a fluorescent protein enables the neuronal activity of single cells in 
live animals to be measured with unprecedented speed and accuracy. 


VIVIANA GRADINARU 
& NICHOLAS C. FLYTZANIS 


Over the past few decades, scientific
discoveries have greatly deepened
our understanding of biology, but our
ability to understand the workings of our own 
minds has proved a frustrating exception. It 
is therefore no surprise that a major aim of 
neuroscience is to develop tools, in particu- 
lar light-based technologies, with which to 
deconstruct brain function and dysfunction 
by controlling and recording brain activity.
Writing in Science, Gong et al. report one of
the best tools so far for the optical monitoring 
of neuronal activity — a strongly fluorescent, 
fast-acting, voltage-sensing protein that can be 
used to image subcellular activity changes in 
model organisms. 

The brain uses complex electrical signalling 
that is orders of magnitude more efficient than 
the fastest computer to react to external and 
internal stimuli, processing these inputs and 
outputting behaviours. The action potentials 
that make up this signalling are caused by the 
opening and closing of protein channels in the 
cell membrane through which ions can flow 
into or out of the cell. Changes in relative ion 
concentrations alter the voltage across the cell 
membrane, causing rapid propagation of elec- 
trical currents down the length of the neuron 
and eventual signalling to other neurons down- 
stream. To decode this neuronal language, we 
need tools that can track electrical activity from 
neuron to neuron across the brain. 

Genetically encoded calcium indicators 


(GECIs)’*, which fluoresce in response to 
the calcium-ion influx that is triggered by 
neuronal activity, are used as standard for 
tracking electrical activity in neurons in vivo. 
These sensor proteins can be used to monitor 
both population-wide and single-cell activity 
in chosen cell types. However, influx of 
calcium ions is an indirect and slow measure
of action potentials and cannot capture all 
smaller, subthreshold events — changes in 
membrane voltage that are not large enough 
to trigger an action potential, but that can 
nonetheless affect brain physiology. 

An alternative approach involves fluorescent 
proteins called genetically encoded voltage 
indicators (GEVIs), which directly report elec- 
trical activity. This approach has undergone 
major developments” in the past few years, 
but, until now, could not detect fast neuronal 
activity in live animals. The ability to read and 
discriminate across the entire bandwidth of 
action-potential frequencies is key, because 
specific activities underlie distinct behaviours. 
Furthermore, some cells fire at high rates, and 
slow sensors such as GECIs might not discrim- 
inate such cells from their slower neighbours. 

Gong et al. developed a GEVI that can
monitor fast neuronal activity in single cells in
live flies and mice. The authors used a microbial
rhodopsin protein called Ace as the basis
for their voltage sensor. Microbial rhodopsins
are light-sensitive ion channels, and were initially
adopted in neuroscience for their ability to
generate electrical currents and so to modulate
neuronal activity’. More recently, these proteins
have been used to monitor electrical currents
because they fluoresce in a voltage-dependent
manner’. They are fast and sensitive sensors, but
their use in live organisms” has been hampered
by the fact that they fluoresce only weakly.

Figure 1 | Sensor with a light touch. Gong et al.’ have developed a genetically encoded voltage sensor
that rapidly and accurately detects neuronal activity in live animals. The sensor consists of a microbial
rhodopsin protein, Ace, fused to a fluorescent protein, mNeonGreen. mNeonGreen is excited by blue-green
light and emits green-yellow fluorescence. Ace spans the cell membrane and absorbs green-yellow
light in a voltage-dependent manner, with more light absorbed during action potentials, when the voltage
increases across the membrane. Thus, the overall level of green-yellow light emitted by the fusion protein
provides a readout for electrical activity, with higher light emission indicating lower membrane voltage or
inactivity. (Adapted from the Gradinaru Group, California Institute of Technology.)

The researchers bypassed this obstacle 
by fusing Ace with the fluorescent protein 
mNeonGreen. In this configuration, blue- 
green light excites mNeonGreen, which 
emits green-yellow fluorescence. A portion
of this fluorescence is absorbed in a volt- 
age-dependent manner by Ace, causing 
mNeonGreen-emitted fluorescence to 
decrease as the membrane voltage rises and 
neuronal activity increases, and to increase 
as the membrane voltage falls (Fig. 1). 
In vitro, the Ace-mNeon fusion protein acts 
six times faster and can resolve closely spaced, 
repeating action potentials much more accu- 
rately than similar protein fusions’. 

To assess the capabilities of their tool in vivo, 
Gong and colleagues compared it with GECIs 
in live mice and flies. Measurements taken 
using Ace-mNeon during a visual task cor- 
roborated previous measurements taken 
with GECIs. In mice, Ace-mNeon flawlessly 
reported single action potentials in neurons at 
the surface of the brain’s cortex region, 20 times 
faster than is possible using GECIs. This is an 
impressive achievement, because intact mam- 
malian tissue is opaque and can be naturally 
fluorescent — both of which are factors that 
can mask the signal from fluorescent proteins. 

In flies, Ace-mNeon recorded more than 
18,000 action potentials with perfect accuracy, 
and detected odour-evoked subthreshold and 
fast voltage changes that a GECI failed to pick 
up. Furthermore, the authors used the protein 
to track voltage propagation from one side of 
a cell to the other with submillisecond preci- 
sion. Such precision tracking was previously 
unachievable in live flies. 

Although the sensor’s performance is 
impressive, major challenges remain before it 
can replace GECIs in vivo. First, the authors 
used conventional fluorescence microscopy for 
in vivo imaging. The effectiveness of this type 
of imaging for sensor detection relies on sparse 
expression of Ace-mNeon, limiting the num- 
ber of cells that can be imaged concurrently. 
Second, for maximum impact, a fast sensor 
requires fast imaging, but imaging speed and 
field of view are inversely correlated in current 
imaging techniques, so rapid imaging limits 
the ability to simultaneously investigate many 
cells. The combination of fluorescence micro- 
scopy and limited field of view meant that 
Gong et al. could study only a handful of cells
at a time. A third challenge is that, although
mNeonGreen is three times more stable to 
light than other rhodopsin-paired fluores- 
cent proteins, extended continuous imaging 
sessions still ‘bleach’ the protein, decreas- 
ing its fluorescence. This limitation could be 
bypassed by using multiple short exposures, 
or by spacing measurements widely enough for 
protein turnover to replace the photobleached 
sensors. 

The benefits of using GEVIs such as 
Ace-mNeon to image activity in live animals 
are undeniable. Nonetheless, better hard- 
ware is required to realize the full potential of 
these voltage reporters. Until that is available, 
calcium sensors will remain the gold stand- 
ard for studying densely labelled cell popula- 
tions simultaneously over extended imaging 
sessions, especially in deep brain areas. The 
development of technologies such as micro- 
endoscopy’ and fibre photometry’ has enabled 
calcium imaging of subcortical brain regions, 
and fine-tuning these techniques for use 
with GEVIs is an exciting possibility for the 
future. Overall, Gong and colleagues’ study
highlights the power of microbial rhodopsins,
especially when paired with strongly fluorescent
proteins, and the need for continued
development of these tools hand-in-hand with
microscopy techniques.


A lizard that generates heat


Birds and mammals generate heat to regulate body temperature, but most 
non-avian reptiles cannot. The discovery of endothermy during the reproductive 
period of a tegu lizard sheds light on the evolution of this characteristic. 


COLLEEN G. FARMER 


The avian and mammalian lineages
diverged 320 million years ago, and
since that time both lineages have
converged on a radically different approach 
to life from that of their common ancestor. 
Birds and mammals are endotherms, mean- 
ing they use internal heat to regulate their body 
temperature; their ancestor, and many extant 
animals such as amphibians and non-avian 
reptiles, are ectotherms that rely on external 
heat sources (Fig. 1). Understanding the con- 
vergent evolution of endothermy in birds and 
mammals is a central question in evolutionary 
physiology, because thermal biology is linked 
to fundamental traits such as body size, food 
requirements and aspects of reproduction. 
Writing in Science Advances, Tattersall et al.’ 
report the remarkable discovery that a lizard 
species uses endothermy during its reproduc- 
tive period. Their finding supports the idea** 
that the ability to exert control over tempera- 
ture during reproduction was the common 
selective agent that drove the evolution of 
endothermy in birds and mammals.

The transition from aquatic to terrestrial 
habitats presented animals with new chal- 
lenges to reproduction; chief among these was 
the fact that eggs laid on land are at risk of des- 
iccation and are subject to greater fluctuations 
in temperature than eggs laid in water. One 
lineage of animals — the amniotes — evolved 
eggs containing a series of fluid-filled mem- 
branes, which reduced the risk of desiccation 
(Fig. 1). Many amniotes further evolved an 
ability to exert control over temperature dur- 
ing reproduction. For example, viviparity 
(giving birth to live young rather than laying 
eggs) allows females to control developmental 
temperature by gaining heat through basking, 
and has evolved independently more than 
100 times in lizards and snakes*. 

Tattersall et al. studied black and white tegu 
lizards (Salvator merianae), which inhabit 
tropical, subtropical and temperate climates 
throughout the plains east of the Andes Moun- 
tains. During autumn and winter, the lizards 
hibernate in burrows, after which their reproductive
phase begins. Males undergo a surge in
testosterone and gonadal growth, emerge from
their burrows and establish territories, but initially
forgo foraging. After the reproductive
period, they reduce activity, feed heavily and
gain weight”®. When females end hibernation,
they mate and deposit yolk in their eggs, which
entails a heavy energy investment: clutch
mass is typically about 40% of body mass’®.
Clutches are laid in nests made of various
materials, including moist grass, small sticks
and other litter®, which probably improves the
insulating properties of the nest.

Figure 1 | Thermal strategies during reproduction. a, Most fishes and
aquatic amphibians are ectotherms, which rely on external sources of heat.
These animals lay many small eggs in water, exerting no parental control of
temperature during reproduction. That strategy changed with the evolution
of amniotes, animals whose eggs have fluid-filled membranes that allow
development on land. Mammals and birds have both evolved endothermy,
meaning they generate body heat, which they can use to incubate eggs
(monotremes and birds) or retained embryos (marsupials and placentals). By
contrast, lepidosaurs, which include lizards and snakes, are ectotherms that
mostly exert reproductive temperature control through egg incubation
after basking or by retaining embryos through to live birth. b, Tattersall
et al.' report that black and white tegu lizards (Salvator merianae),
which are ectotherms during most life stages, use endothermy during
their reproductive period.

After laying, the females remain with the
eggs for up to around 75 days* with little or
no foraging activity. Female attendance greatly
influences nest temperature; one study found
attended nests to be 5°C warmer than a control
nest where females were barred from brooding’.
Because these lizards are capital breeders
(that is, their reproduction is decoupled
temporally from food acquisition and assimilation),
changes in body temperature during
their reproductive period cannot be explained
by an increase in metabolism associated
with feeding.

Tattersall et al. investigated the relationship
between reproduction and thermoregulation
in sexually mature tegu lizards reared in a captive
colony. The body temperatures of both
male and female lizards were equal to their
burrow temperatures during most of the hibernation
period, except for between days 160 and
180, during which the researchers recorded
an increase in body temperature above burrow
temperature. This is the period in which
the lizards rouse from hibernation and
begin the reproductive period. The lizards supplemented
their endogenous heat production
by basking to gain heat during the day, retreating
to their burrows at night. Remarkably, body
temperature remained elevated throughout the
night, whereas during the non-reproductive
season, body temperature equilibrated with
the temperature of the burrow.

With these observations, Tattersall et al.
have established that, during the reproductive
period and when insulated by a burrow, these
relatively small (around 2-kilogram) lizards
can generate heat that raises their body temperature
by up to 10°C above ambient, and
that this thermogenesis is not related to feeding
or activity. Furthermore, the observations
refute conventional wisdom that small animals
lacking body insulation, such as hair and feathers,
cannot significantly increase their body
temperature.

The authors also placed reproductive-phase,
fasting lizards in a temperature-controlled
chamber for 8 days, and found that they maintained
body temperatures that were greater
than ambient. Disturbing the lizards caused
their body temperatures to decline, possibly
owing to increased heat dissipation as a result
of elevated peripheral blood flow. This observation
may explain why endothermy has been
missed by other researchers, who have measured
body temperature in disturbed animals
rather than quiet, undisturbed animals.

Tattersall and colleagues’ work not only
provides the first evidence of endothermy
in a lizard, but also complements previous
findings of endothermy during reproduction
in pythons”. Like the tegu lizards, diamond
pythons (Morelia spilota) construct insulated
nests and achieve body temperatures of up to
13°C above ambient when brooding". We now
know that reproductive endothermy is not an
oddity of one clade of snakes. Indeed, there is
increasing evidence that many species of bird
and mammal improve their capacities for thermogenesis
and endothermy during reproduction
(reviewed in refs 2, 3).

The selective drivers for the evolution of
endothermy are debated. However, convergent
evolution is one of the strongest lines of
evidence for the adaptive significance of a trait.
Thus, this discovery in lizards corroborates the 
idea that the initial selective benefits of the evo- 
lution of endothermy in birds and mammals 
were reproductive~’. Studies of pythons have 
shown that the thermal regime during incuba- 
tion affects the incubation period as well as the 
characteristics of the hatchlings (such as initial 
growth rates, escape behaviour and willingness 
to feed’*), providing several potential bases on 
which reproductive endothermy may provide 
an evolutionary advantage. 

Intriguing questions remain. How do these 
lizards generate body heat, and do so only at 
certain times? Precisely how does thermo- 
genesis facilitate the lizards’ reproduction — 
might it expand the geographical range over 
which this species can reproduce, or alter the 
time window for reproduction? Are tegus and 
pythons alone, or are there other reproduc- 
tively endothermic non-avian reptiles? Repro- 
ductive endothermy may yet be discovered in 
other species if they are studied using methods 
that do not disturb them during the reproduc- 
tive period, when insulating nests reduce rates 
of heat dissipation and metabolism is increased 
by the synthetic demands of reproduction.


Small RNA with
a large impact 


A simultaneous comparison of the RNA molecules expressed by Salmonella 
bacteria and human cells during infection reveals how a bacterial small RNA alters 
the transcript profiles of both the bacteria and the host cells. SEE ARTICLE P.496 


MATTHIAS P. MACHNER & GISELA STORZ 


What happens when bacteria
encounter or enter host cells? How
does each of the species respond,
the bacteria to survive in their new environment
and the host cells to either tolerate
non-harmful bacteria or defend against path- 
ogenic ones? To answer these questions, it is 
imperative to understand how gene transcrip- 
tion in both cells changes during the encounter. 
Over the years, approaches applied to this 
problem have ranged from in vivo gene- 
expression technology’ to sequencing the full 
complement of bacterial or host-cell tran- 
scripts” (the transcriptome). However, such 
analyses have largely focused on messenger 
RNAs and have profiled either the bacteria 
or the host, not both at once. In this issue,
Westermann et al.’ (page 496) go beyond the
individual organisms by using dual RNA- 
seq, an approach that simultaneously profiles 
bacterial and host transcriptomes throughout 
the course of an infection. 

The RNA-seq method takes advantage of 
the ever-increasing depth of sequencing (the 
number of reads for a particular sample) now 
possible. Westermann et al. first assessed 
whether the dual RNA-seq approach accu- 
rately reflected known gene regulation in 
human HeLa cells and in the bacterium Salmo- 
nella enterica serovar Typhimurium (hereafter 
Salmonella), a common cause of food 
poisoning, during infection. The authors’ 
data confirmed that, as previously reported’, 
transcription of invasion-related genes in the 
genomic region known as Salmonella pathogenicity
island 1 (SPI-1) was reduced after
bacterial internalization, whereas transcription
of SPI-2 genes, which promote intracellular
survival, increased.

Figure 1 | PinT orchestrates gene expression in Salmonella and its host cells. When Salmonella bacteria
invade cells, expression of the genes sopE and sopE2 facilitates bacterial invasion. After the bacteria are
internalized, the levels of the transcripts from these genes fall. By simultaneously monitoring the RNA
molecules present in both Salmonella and host cells over the course of an infection, Westermann et al.*
found that a small regulatory RNA expressed in Salmonella, which they name PinT, induces this repression
by base-pairing with the sopE and sopE2 messenger RNAs. PinT also base-pairs with the mRNA that
encodes CRP, a protein that activates transcription of genes encoding SPI-2 proteins. This repression is
reduced later in infection, allowing the SPI-2 proteins to regulate the bacterium’s intracellular growth.
The authors also observed differences in host-cell transcripts when the cells were infected with Salmonella
mutants lacking PinT, including altered levels of long non-coding RNAs (lncRNAs) and the mRNA for
SOCS3. This suggests that PinT targets other bacterial genes that influence host-cell gene expression.

Having validated the sensitivity of the 
approach, Westermann et al. focused on 
mRNAs and regulatory RNAs whose expres- 
sion changed during the course of the 24-hour 
infection. In bacteria, small regulatory RNAs 
(sRNAs) that base-pair with target mRNAs to 
modulate the mRNAs’ stability or translation
are integral to a wide range of stress responses, 
including the response to host cells®. Thus, 
the authors were intrigued by an 80-nucleo- 
tide sRNA, which they denoted PinT, whose 
expression was highly induced during infec- 
tion, and which was activated by the bacterial 
PhoP/Q system, known to be crucial for 
Salmonella survival in the intracellular 
environment. 

A striking finding of the dual RNA-seq 
analysis was that tens of bacterial and hun- 
dreds of host-cell transcripts were affected 
merely by the presence or absence of PinT 
(Fig. 1). On the bacterial side, overproduction 
of PinT led to reduced levels of the mRNAs 
encoding SopE and SopE2, two SPI-1 effec- 
tor proteins that mediate host-cell invasion by 
Salmonella. These mRNAs were elevated in 
strains lacking the pinT gene. By mutating the 
pinT, sopE and sopE2 sequences, the authors 
revealed that the inhibitory effect of PinT 
occurred through direct base-pairing with 
the mRNAs. Dual RNA-seq also revealed a 
role for PinT in repressing SPI-2 genes later 
in infection. However, control of these genes 
was indirect and occurred through PinT base- 
pairing with the mRNA that encodes the cyclic 
AMP receptor protein (CRP), an activator of 
transcription of SPI-2 genes. These data indi- 
cate that PinT, on bacterial internalization, 
controls the temporal expression of both 
SPI-1 effectors and SPI-2 virulence genes, 
thus facilitating the bacterium’s transition 
from an invasive state to a state of intracellular 
replication. 

Westermann et al. then compared the 
transcriptomes of the host HeLa cells chal- 
lenged with either wild-type Salmonella or a 
strain lacking PinT. They discovered numerous 
changes in cells infected with the PinT-lacking 
mutant, including altered levels of many long 
non-coding RNAs (lncRNAs), hyperactivation
of mitochondrial genes, increased abundance 
of mRNAs for proteins involved in innate 
immune pathways (such as the interleukin-8 
mRNA) and accelerated activation of SOCS3, 
a protein that regulates the inflammatory 
JAK-STAT signalling pathway. The last find- 
ing is of particular interest, because properly 
balanced JAK-STAT signalling is essential for
optimal Salmonella infection. Too little
inflammation reduces the ability of Salmonella 
to compete with the intestinal microbiota, 
whereas too much inflammation might result 
in the bacteria being killed by the host”. 

The dual RNA-seq approach of simulta- 
neously interrogating the bacteria and host 
transcriptomes gives unprecedented insight 
into the dynamic RNA-expression landscape 
during the bacterium-host interaction. This 
method is particularly useful for analysing 
genes such as pinT that, when deleted, cause 
little or no detectable change in standard 
bacterial virulence assays, but nevertheless 
have a strong impact on mRNA levels. The 
authors’ study of PinT, however, also dem- 
onstrates some of the ongoing challenges in 
studying bacterium-host interactions. For
example, although the effect of PinT on host 
gene expression such as JAK-STAT signalling 
is interesting, it remains unclear which PinT- 
controlled bacterial gene(s) are responsible. 
In addition, it has become evident that there 
is substantial variation in gene expression
even between individual bacteria in single-species
cultures; these differences are not captured
by approaches in which cell samples are
sequenced in bulk.

Finally, although dual RNA-seq eliminates
the need to separate bacteria from the host cell
for investigation, a comprehensive transcriptome
analysis over the course of an infection
still demands the analysis of a series of time
points in short succession. Westermann et al.
conducted a time series for their analysis
of HeLa cells infected with wild-type and
PinT-lacking Salmonella, but this approach is
impractical for the analysis of large collections
of bacterial mutants or isolates. Continued
improvement in deep-sequencing technologies,
including single-cell sequencing, will
undoubtedly circumvent some of these limitations.
In the meantime, the application of
Westermann and colleagues’ dual RNA-seq
approach to a wide range of bacterium-host
interactions will open a treasure trove of insight
into transcriptional actions and reactions as
bacteria enter and proliferate in host cells.


STELLAR ASTROPHYSICS

The mystery of
globular clusters


The discovery of multiple stellar populations — formed at different times — in 
several young star clusters adds to the debate on the nature and origin of such 
populations in globular clusters from the early Universe. SEE LETTER P.502 


ANTONELLA NOTA & CORINNE CHARBONNEL 


Since the discovery of the first globular
cluster in 1665, these large, ancient
agglomerates of stars — which can host 
up to a million suns — have fascinated both 
astronomers and the public. They are visible 
through small telescopes, and their exquisite 
spherical symmetry singles them out in the 
sky and makes them easy to classify. How- 
ever, their formation and evolution history 
is unclear. On page 502 of this issue, Li et al.¹
report observations of young star clusters that 
may help to crack the mystery of the oldest 
star clusters. 

Globular clusters remain gravitationally 
bound as they orbit their host galaxy, on a 
timescale comparable to the lifetime of the 
low-mass stars they host. They are 10 billion 
to 13 billion years old — their age defines the 
boundaries of the age of the Universe. These 
well-studied systems were long thought to be 
simple and to host a single population of stars 
that all formed at the same time. 

But in 2004, everything we knew about 
globular clusters changed radically. Using
accurate Hubble Space Telescope photometry”,
astronomers detected not one, but multiple 
stellar populations in ω Centauri, one of the
most massive globular clusters in the Milky 
Way. Subsequent studies (see ref. 4, for exam- 
ple) of other globular clusters confirmed that 
this was not an isolated finding but the discovery
of a general feature that revolutionized our
understanding of such objects. These stellar 
populations were shown to exhibit unique 
chemical properties that are not found in any 
other stellar environment’. This means that 
these clusters are not simple at all, and have 
experienced more than one star-forming event 
during their lifetime. 

The exciting news” inspired different star- 
formation models to account for the photo- 
metric and spectroscopic properties of the 
different populations hosted by single clusters. 
For example, colliding winds from late-stage, 
medium-mass stars or ejecta from fast-rotating 
massive stars were invoked as a trigger and/ 
or an origin of the processed material that 
fuelled the second generation of star forma- 
tion. Astronomers also proposed an explana- 
tion based on a single generation of stars. They 
suggested that very-low-mass stars with protoplanetary
disks (gas rotating around newly
formed stars) could acquire the observed
properties by sweeping up material shed from 
interacting binary or rapidly rotating massive 
stars in a continual growth process. 

However, most of these theories suffer from 
major issues. Some of them imply a more 
massive original cluster than observed, with 
a substantial fraction of the original stars lost 
to the galactic halo. But this enhanced mass is 
in contrast to theoretical expectations for the 
dynamical evolution of these systems’ and to 
observations of dwarf galaxies and of young 
massive clusters in our Local Group of galax- 
ies®. The consensus in the community is that 
we urgently need alternative, innovative ideas 
to overcome the impasse. 

Enter Li and colleagues’. The authors challenge
the status quo by presenting observations
of three massive clusters that are 1 billion to 
2 billion years old in the Magellanic Clouds, 
members of our Local Group. The authors 
show clear evidence of a late burst of star for- 
mation that occurred a few hundred million 
(up to one billion, within the errors) years after 
the clusters’ initial formation epoch. Multiple 
stellar population sequences are visible even 
in the authors’ raw diagrams. The colours of 
the younger stellar sequences are consistent 
with an enhanced abundance of helium. This 
would be expected as a result of the chemical 
anomalies in globular clusters that are due to 
hydrogen burning at high temperature (helium 
is the main yield of hydrogen burning). 

To explain the data, Li et al. propose that 
such clusters orbiting within the gaseous disks 
of their host galaxies could accrete sufficient 
gas reservoirs to form the next generation of 
stars. They suggest that this mechanism may 
account for the ubiquitous multiple stellar
populations in globular clusters, assuming 
that the observed massive young star clus- 
ters in the Magellanic Clouds are the modern 
counterparts of the old globular clusters. It is 
a plausible working hypothesis, but not a con- 
sensus view: some astronomers think that the 
formation mechanism of young massive star 
clusters in our neighbouring galaxies might 
be different from the globular-cluster forma- 
tion that occurred in the early Universe. The 
link between young and old clusters has yet 
to be fully established. The determination of 
whether the chemical properties of the young- 
est stars in the Magellanic clusters are similar 
to those of globular-cluster stars will eventually 
address the issue. 

Li and colleagues may not yet have all the 
answers needed to put the proposed general 
model on a firm quantitative footing. For 
example, the mass in the younger observed
population sequence is much smaller than
that typically observed in globular clusters. 
There is no explanation given for the origin 
of the additional helium that is claimed to 
be present in the youngest stars, nor for the 
fact that the accreted material must have 
the same metal content as the original proto- 
cluster, even though the host galaxies might 
have evolved chemically between the distinct 
star-formation events. (By ‘metal’, astrono- 
mers mean any element after helium in the 
periodic table.) 

Nevertheless, the findings present an 
innovative approach that deserves further 
attention. It will certainly advance the ongoing 
debate, as well as trigger original thoughts, 
future observations and corresponding inter- 
pretations. And it could lead to a final, robust 
explanation in the not too distant future — 
an example of how scientific debate works 
at its best.


A mechanism 
for myelin injury 


The cells that insulate neuronal processes with a myelin membrane sheath are 
damaged during stroke. Data now show that an influx of calcium ions mediated 
by the TRPA1 protein contributes to myelin injury. SEE LETTER P.523 


AIMAN S. SAAB & KLAUS-ARMIN NAVE 


Normal brain function requires the
rapid transmission of information
between brain regions along neuronal
projections called axons. The ability of
axons to conduct information depends on the 
well-being of a supporting class of glial cells 
called oligodendrocytes', which speed up con- 
duction by enveloping the axonal projections 
in a multilayered membrane sheath called 
myelin. Damage to oligodendrocytes and 
the myelin sheaths that they produce has 
been associated with axonal dysfunction in 
numerous disorders, including cerebral palsy, 
spinal-cord injury, multiple sclerosis and 
stroke. In this issue, Hamilton et al.’ (page 523) 
report that the mechanisms that underlie this 
damage are more complex than commonly 
thought and involve the activation of a 
channel protein called TRPA1. 

During a stroke, the local loss of blood flow
in the brain, known as ischaemia, causes dam- 
age to neurons and glial cells, including oligo- 
dendrocytes. Even transient ischaemia causes 
permanent defects in axonal conductivity that 
can be only partially restored when the supply 
of oxygen and glucose to the tissue is
re-established’. Ischaemia causes the release
of the neurotransmitter molecule glutamate, 
which excites oligodendrocytes. Some evi- 
dence*® indicates that blocking glutamate- 
receptor proteins reduces myelin damage and 
axon dysfunction during ischaemia. AMPA/ 
kainate-type and NMDA-type glutamate 
receptors are ion-channel proteins that, when 
activated, allow positively charged ions such 
as sodium (Na⁺) and calcium (Ca²⁺) to flow
into the cell. High levels of Ca²⁺ are toxic to
cells, and so the death of oligodendrocytes and 
injury to myelin sheaths during ischaemia are 
widely thought to reflect the overactivation of 
these glutamate receptors” ’ — the same mech- 
anism by which ischaemia damages neurons. 
Hamilton et al. revisited this issue in brain 
slices from the rat cerebellum. They found 
evidence that, in the cerebellar white matter 
(a tissue that contains a high density of myelinated
axons), deprivation of oxygen and
glucose — a model of ischaemia-evoked oli- 
godendrocyte and myelin damage — might 
not result in glutamate-receptor overactivation 
alone. First, the authors characterized ischae- 
mia-evoked membrane currents in oligoden- 
drocytes and monitored the corresponding 
intracellular changes in Ca²⁺, Na⁺, potassium
ions (K⁺) and magnesium ions. Strikingly,
although oxygen and glucose deprivation did 
cause an inward flow of ions in oligodendro- 
cytes that was mediated by glutamate release, 
this current flow seemed to be triggered by an 
increase in extracellular K* concentration and 
closure of K*-channel proteins, rather than by 
glutamate receptors. 

Similar to an earlier report’, Hamilton 
and colleagues found that ischaemia led to 
an increase in intracellular Ca²⁺ levels in
oligodendrocytes and their axon-ensheathing
processes. However, inhibition of NMDA
and AMPA/kainate receptors did not prevent
Ca²⁺ influx. These data suggest that glutamate
receptors are not the primary channels responsible
for the ischaemia-evoked elevations of
Ca²⁺ concentrations that cause myelin damage.

Next, the authors demonstrated that the
ischaemia-evoked elevation of extracellular
K⁺ levels leads to increased hydrogen-ion
levels (acidification) in oligodendrocytes,
which in turn triggers Ca²⁺ influx (Fig. 1). By
increasing local intracellular H⁺ levels in
oligodendrocytes and measuring intracellular Ca²⁺
changes, Hamilton et al. investigated which
other channels might contribute to
ischaemia-evoked Ca²⁺ influx. After taking into account
the known physiological properties of different
ion-channel proteins and testing the effects of
channel stimulators and inhibitors, the authors
concluded that the channel responsible is
TRPA1, a widely expressed member of the
family of transient receptor potential (TRP)
channels (Fig. 1). Activation of TRPA1 channels
allows Ca²⁺, Mg²⁺ and Na⁺ to enter the cell.

In line with Hamilton and colleagues’ conclusion,
ischaemia-triggered Ca²⁺ entry was
considerably reduced in white-matter cerebellum
slices from mice in which TRPA1 had been
deleted. However, the authors did not test the
in vivo effects of ischaemia on white matter in
these mice. Instead, using isolated optic nerves
from rats, the authors showed that blocking
TRPA1 channels during oxygen and glucose
deprivation reduced myelin damage, but had
no effect on axonal injury. Thus, unlike the
case with oligodendrocytes, damage to axons is
mediated not by the activation of TRPA1 channels,
but by other mechanisms of Ca²⁺ influx.

Figure 1 | Myelin injury in a model of ischaemia. Oligodendrocyte cells wrap around neurons and
produce an insulating stack of membranes called myelin that speeds up neuronal signal transmission.
When brain regions are deprived of glucose and oxygen (a condition called ischaemia), oligodendrocytes
and myelin become damaged. Hamilton et al.¹ report that this damage is caused, in part, by a pathway
that involves an increase in extracellular levels of potassium ions (K⁺). Through an unknown mechanism
(dashed arrow), increased K⁺ levels trigger an intracellular rise in hydrogen ions (H⁺). This reduces the
pH in the cell, activating TRPA1-channel proteins and leading to an influx of calcium ions (Ca²⁺). High
levels of Ca²⁺ are toxic to oligodendrocytes and damage myelin. Glutamate-receptor proteins such as
NMDA-type receptors can also mediate Ca²⁺ influx, but whether they have a role in ischaemia-evoked
myelin damage in this setting is unclear.

Hamilton et al. also detected spontaneous
Ca²⁺-level changes in the myelin of some
normal oligodendrocytes, which were not
deprived of glucose and oxygen. This spontaneous
activity is probably indicative of
axonal activity and the concurrent release
of glutamate. So, given that myelin contains
AMPA/kainate and NMDA receptors” ’, and
that ischaemia-evoked Ca²⁺-level changes in
optic nerves have previously been shown to
be caused by activation of glutamate receptors’,
why did the authors not find evidence
for glutamate-evoked Ca²⁺ changes?

The levels of the messenger RNAs that
encode NMDA receptors are low in mature
oligodendrocytes, but the levels of TRPA1 mRNA
are almost undetectable’, and the researchers
still detected TRPA1-mediated Ca²⁺ influx in
cerebellar oligodendrocytes. Evidence indicates
that NMDA receptors move to myelin processes
that face the axonal surface as oligodendrocytes
mature”. This might help to explain why
glutamate-evoked Ca²⁺ responses are difficult to
detect in oligodendrocytes. Alternatively, perhaps
oligodendrocytes in the optic nerve, which
myelinate axons of only glutamate-releasing
neurons, are different from the cerebellar
oligodendrocytes that also myelinate neurons
producing a different neurotransmitter, GABA.

In contrast to the current paper, a study
published last month? provided evidence that,
in optic nerves, NMDA and AMPA/kainate
receptors do indeed act to mediate Ca²⁺ influx
in mature myelin. Such influx in myelin is
mediated by axonal activity and the vesicular
release of glutamate. However, beyond the level
of Ca²⁺ signals, the physiological functions of
oligodendroglial NMDA receptors still need
to be resolved.

Hamilton and colleagues’ study demonstrates 
that in vitro damage to myelin owing to oxygen
and glucose deprivation is more complex than
anticipated, with TRPA1 channels joining the 
scene. Whether the results apply in vivo or to 
human TRPA1 channels, which have a differ- 
ent pharmacological response from those of 
mice”, remains to be seen. Trials of NMDA- 
receptor blockers in people who have had a 
stroke have largely failed, in part because the 
drugs were administered too late, given the low 
doses at which their adverse side effects can 
be tolerated. Identifying other pharmacologi- 
cal targets raises the hope that safer drugs may 
be found.

Technological leap for sweat sensing


Sweat analysis is an ideal method for continuously tracking a person’s 
physiological state, but developing devices for this is difficult. A wearable sweat 
monitor that measures several biomarkers is a breakthrough. SEE LETTER P.509 


JASON HEIKENFELD 


Athletics trainers, physicians and even
pharmacists can take a sample
of blood, saliva or urine and measure
a whole panel of analytes (dissolved com- 
pounds) to reveal your physiological status 
at the time of sample collection. But none of 
the measurement techniques involved is con- 
veniently portable or can continuously collect 
data for many hours or days, with the exception
of glucose monitoring, which typically
requires blood samples to be drawn by needle
at regular intervals. On page 509 of this issue,
Gao et al.¹ report a truly non-invasive,
continuous biomonitoring device: a wearable,
Bluetooth-enabled band containing a panel of 
sensors for sodium, potassium, lactate, glucose 
and skin temperature. And rather than using 
the body fluids mentioned above, the device 
measures analytes in human sweat. 

Figure 1 | Analysing sweat. Many biomarkers for a subject’s physiological state, such as glucose, lactate
or chloride ions (Cl⁻), enter sweat from cells that form the walls of sweat ducts inside the skin. Gao et al.¹
report devices that can be worn as wrist or head bands, and which continuously analyse several molecules
and ions in sweat using sensors placed on the skin’s surface. (Adapted from ref. 2.)

Making a wearable band that electrochemically
senses sweat analytes is extremely
difficult. The sensors must be prepared from
scratch from basic chemicals; they can't just
be purchased like the accelerometer chips used 
in smart watches and activity trackers. Another 
challenge is creating electronics that work with 
the ultra-high electrical impedance of the sen- 
sors. Basically, you need to figure out how to 
take a potentiostat — a device used to control 
electroanalytical experiments that typically 
weighs more than 2 kilograms — and make it 
so small and thin that you can wrap it around 
your wrist. 

For decades, sweat analysis was relegated 
mainly to medical labs; this hampered its 
broader use for two reasons. The first is obvi- 
ous: who can afford to tote around a cadre of 
trained medical staff and the associated equip- 
ment? The second reason is that conventional 
clinical methods for sweat collection and sens- 
ing could lead to inaccurate measurements. 
This is because existing clinical infrastructure 
is ill-equipped to work with the tiny volumes 
obtainable from sweat. 

Gao et al. address these problems by putting 
tiny electronic sweat sensors right up against 
the skin (Fig. 1) — an approach that others 
have also reported”*. Sweat and its analytes are 
thus quickly measured as they emerge onto the 
skin’s surface. These sensors are highly electro- 
chemically selective’, and, despite their minia- 
ture sizes (on the order of square millimetres 
or smaller), they can distinguish a single type 
of ion or molecule from thousands of others 
in sweat. 

This ability is a real leap forward for wear- 
able devices, and couldn't have been made just 
by improving the rudimentary electrical or 
optical sensors found in commercially avail- 
able activity trackers. For example, commercial 
trackers at best use a simple measure of electri- 
cal conductance on skin as a non-quantitative 
measure of sweat rate, whereas measuring 
sodium and potassium concentrations with 
electrochemical sensors quantifies sweat rate
and could also quantify the total amount of
electrolytes lost during exercise. 

Importantly, Gao and colleagues’ devices 
use many sensors. Previous devices have 
been limited to a single sensor, which could
generate misleading information: if a stand-alone
sensor shows a signal change, it could
be because sweating has stopped, because the 
sensor has fallen away from the skin, or even 
because the sensor is failing. Having multiple 
sensors can clarify what is happening. For 
example, potassium levels in sweat are fairly 
invariant with sweat rate and with normal 
physiological changes in the body’. So if there 
is a change in sodium, lactate or glucose signals 
while the potassium signal holds steady, then 
the other sensor changes can be trusted to be 
caused by a real physiological event. 
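
The cross-check described above can be sketched in a few lines. This is only an illustration of the reasoning, not the logic used in Gao and colleagues' devices: the function name, the 10% tolerance and the example readings are all assumptions made for the sketch.

```python
def change_is_trustworthy(baseline, current, tolerance=0.10):
    """Return True if a change in another analyte can be attributed to physiology.

    Potassium acts as the reference channel because its concentration in
    sweat is fairly invariant. `tolerance` (a hypothetical 10%) is the largest
    relative drift in potassium that is still treated as "steady".
    """
    k_drift = abs(current["potassium"] - baseline["potassium"]) / baseline["potassium"]
    others_changed = any(
        abs(current[name] - baseline[name]) / baseline[name] > tolerance
        for name in baseline
        if name != "potassium"
    )
    return others_changed and k_drift <= tolerance


# Illustrative readings in arbitrary units (not measured data).
baseline = {"potassium": 8.0, "sodium": 40.0, "lactate": 10.0, "glucose": 0.10}
sodium_rise = {"potassium": 8.1, "sodium": 60.0, "lactate": 10.0, "glucose": 0.10}
sensor_detached = {"potassium": 2.0, "sodium": 10.0, "lactate": 2.5, "glucose": 0.02}

print(change_is_trustworthy(baseline, sodium_rise))      # potassium steady, sodium changed
print(change_is_trustworthy(baseline, sensor_detached))  # every channel dropped together
```

In the first case only sodium moves, so the change is accepted; in the second, potassium collapses along with everything else, which points to a sensor or contact problem rather than a physiological event.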

The Bluetooth capability of their devices 
enabled Gao et al. to monitor continuously 
recorded data for at least an hour, and the 
types of sensors and electronics used should, 
in theory, enable such monitoring for 24 hours 
or more. Previously reported devices lacked 
Bluetooth. Having this capability is certainly 
commercially relevant, and start-up compa- 
nies have developed functional, but unpub- 
lished, Bluetooth sweat-sensing technology 
in the form of watches’ or patches’. 

The potential applications of wearable 
sweat-sensing devices extend well beyond 
those related to exercise. For example, the 
hormone cortisol is a marker of stress, and its 
concentrations in sweat are similar to those 
found in blood’, making it a possible target for 
future monitoring. Even small-molecule drugs 
and their metabolites come out in sweat, so this 
body fluid might one day be used to moni- 
tor the amount of active drug in a patient’s 
blood — helping to avoid rises and falls in drug 
levels between doses. 

Today’s commercially available wearables 
largely rely on decades-old technology. Their
market success is due to a convergence of
improved affordability and ergonomics and a 
rapidly growing consumer awareness of health. 
The next watershed in wearables will probably 
be driven by scientific breakthroughs. Sweat 
biomonitoring arguably has the greatest poten- 
tial among the emergent non-invasive tech- 
nologies. But such potential will go unfulfilled 
unless scientists dig into the unresolved ‘hard 
science’ of this approach. 

For example, cutting-edge, commercial, 
point-of-care blood sampling and sensing 
technologies proudly claim that as little as 
20 microlitres of blood are needed for some 
tests. But square-millimetre-sized sensors on 
the skin will receive, at most, several nanoli- 
tres of sweat per minute’. Just placing such 
sensors against the skin does not fully resolve 
this problem, because the gap between a sen- 
sor and the rough surface of the skin is so large 
that it takes periods of tens of minutes for fresh 
samples of sweat to displace previously accu- 
mulated sweat”. Although not exactly real-time 
monitoring for your workout, this is a good start,
and is certainly better than repeated blood 
draws. 
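
The scale of this volume mismatch can be made concrete with a back-of-the-envelope calculation. The 20-microlitre figure comes from the text; the sweat rate of 5 nanolitres per minute is an assumed value standing in for "several nanolitres per minute", and the function name is hypothetical.

```python
BLOOD_TEST_VOLUME_UL = 20.0       # from the text: microlitres needed for some blood tests
ASSUMED_SWEAT_RATE_NL_MIN = 5.0   # assumption: "several nanolitres" per minute per sensor


def minutes_to_collect(volume_ul, rate_nl_per_min):
    """Minutes needed to accumulate `volume_ul` microlitres at `rate_nl_per_min`."""
    return volume_ul * 1000.0 / rate_nl_per_min   # 1 microlitre = 1,000 nanolitres


minutes = minutes_to_collect(BLOOD_TEST_VOLUME_UL, ASSUMED_SWEAT_RATE_NL_MIN)
print(f"{minutes:.0f} minutes, or about {minutes / 60:.0f} hours")
```

Under that assumption, a square-millimetre sensor would need about 4,000 minutes (nearly three days) to receive the volume that a point-of-care blood test consumes, which is why sweat sensors have to work with nanolitre-scale samples.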

Consider also possible applications involv- 
ing situations in which you are unlikely to be 
sweating, such as monitoring your medication 
levels while at the office. Methods exist for 
locally stimulating sweat by iontophoresis — 
that is, using a tiny electrical current to drive 
a chemical sweat stimulant into the skin. But 
these methods were commercialized for col- 
lecting single sweat samples, not for repeated 
or prolonged sweat monitoring throughout a 
day or week. Alternative methods must there- 
fore be developed. 

Fortunately, the remaining challenges for 
sweat biomonitoring do not seem to be funda- 
mental impediments. As Gao and colleagues’ 
work, and that of others”, reveals the scale 
of the opportunities in this field, researchers 
will undoubtedly come up with innovations to 
transform technology that is currently merely 
appealing into something that, one day, you 
could not imagine living without.

Allowable CO₂ emissions based on regional
and impact-related climate targets


Sonia I. Seneviratne, Markus G. Donat, Andy J. Pitman, Reto Knutti & Robert L. Wilby


Global temperature targets, such as the widely accepted limit of an increase above pre-industrial temperatures of two 
degrees Celsius, may fail to communicate the urgency of reducing carbon dioxide (CO₂) emissions. The translation of CO₂
emissions into regional- and impact-related climate targets could be more powerful because such targets are more directly
aligned with individual national interests. We illustrate this approach using regional changes in extreme temperatures and
precipitation. These scale robustly with global temperature across scenarios, and thus with cumulative CO₂ emissions.
This is particularly relevant for changes in regional extreme temperatures on land, which are much greater than changes 
in the associated global mean. 


The Intergovernmental Panel on Climate Change (IPCC) Fifth
Assessment Report included a figure in the Summary for
Policymakers of Working Group 1 that linked global mean temperature
changes (ΔTglob) to total CO₂ emissions from 1870 onwards¹
(Fig. 1). This figure is compelling because it shows a clear linear relationship
between cumulative CO₂ emissions and a measure of the global
climate response. The obvious consequences are (1) that every tonne of
CO₂ contributes about the same amount of global warming no matter
when it is emitted, (2) that any target for the stabilization of ΔTglob implies
a finite CO₂ budget or quota that can be emitted, and (3) that global net
emissions at some point need to be zero”®.

This simple relationship between CO₂ emissions and changes in ΔTglob
(Fig. 1) has helped overcome one communication barrier for the public
in relating greenhouse gas emissions to the climate system response.
However, there remains another obstacle to the full appreciation of associated
climate impacts, namely, the translation of changes in global mean
temperature to regional-scale consequences for society and the environment.
In this Perspective, we demonstrate the feasibility and utility of
quantitatively relating global cumulative CO₂ emissions to regional climate
targets. We illustrate this approach by scaling changes in hot and
cold extreme temperatures and heavy precipitation events with changes
in the global mean temperature.
Figure 1 | Simulated global mean surface temperature increase as a function of
cumulative total global CO₂ emissions. This is figure SPM.10 from ref. 1. It was derived from
various lines of evidence. Model results over the historical period (1860-2010) are indicated
in black. The coloured plume illustrates the multi-model spread over the four Representative
Concentration Pathway (RCP) scenarios. The multi-model mean and range simulated by
Coupled Model Intercomparison Project Phase 5 (CMIP5) models, forced by a CO₂ increase of
1% per year, is given by the thin black line and grey shading. For a given amount of cumulative
CO₂ emissions, the 1% per year CO₂ simulations exhibit less warming than those driven by RCPs,
which include additional non-CO₂ forcings. Temperature anomalies are given relative to the
1861-1880 base period; emissions are given relative to 1870. (Horizontal axis: cumulative total
anthropogenic CO₂ emissions from 1870, in Gt CO₂. Vertical axis: temperature anomaly relative
to 1861-1880, in °C.)

BOX 1
Linking regional extremes, global
means, and cumulative emissions


We use output from the climate model simulations contributing
to CMIP5°®. Here we present results for climate extreme
indices representative of the hottest day (TXx) and coldest
night (TNn) of the year, as well as the annual maximum
consecutive 5-day precipitation total (Rx5day). Climate extreme
indices®’” were calculated for the historical simulations®°® and
future projections®? from the CMIP5 ensemble. We use one
run (r1i1p1) from models that provide historical simulations
during 1861-2005, as well as RCP8.5 and RCP4.5 scenario
simulations for the twenty-first century (see Supplementary
Table 1). For the analysis of transient changes we concatenated
historical (1861-2005) and RCP (2006-2099) simulations.
We restricted our analyses to 1861-2099, which was common
to all model runs. Global mean temperatures were calculated as
the area-weighted global averages of annual mean temperatures.
Extreme index fields were remapped to a common 2.5° x 2.5°
analysis grid to allow calculation of local ensemble averages and
ensure that the same regions from each model contribute to the
regional analyses.
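
Area weighting matters because cells on a regular latitude-longitude grid shrink towards the poles. The sketch below shows one common way of doing it, weighting each latitude band by the cosine of its latitude. It is an illustration under that assumption, not the authors' code; the function name and the toy field are hypothetical.

```python
import math


def area_weighted_mean(field, lats_deg):
    """Area-weighted mean of a field on a regular latitude-longitude grid.

    field    -- list of rows, one row of values per latitude band
    lats_deg -- latitude of each row's cell centres, in degrees
    Each cell is weighted by cos(latitude), which is proportional to the
    cell area on a regular grid.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        w = math.cos(math.radians(lat))
        weighted_sum += w * sum(row)
        weight_total += w * len(row)
    return weighted_sum / weight_total


# Cell-centre latitudes of a 2.5-degree grid: -88.75, -86.25, ..., 88.75
lats = [-88.75 + 2.5 * i for i in range(72)]
n_lon = 144  # 360 / 2.5

# Toy anomaly field (not model output): 1 degree everywhere, 3 degrees poleward of 60 N.
field = [[3.0 if lat > 60 else 1.0] * n_lon for lat in lats]

simple = sum(sum(row) for row in field) / (len(lats) * n_lon)
weighted = area_weighted_mean(field, lats)
print(f"unweighted: {simple:.3f}, area-weighted: {weighted:.3f}")
```

The unweighted mean overstates the contribution of the strongly warming high latitudes, because the polar cap covers far less area than its share of grid rows suggests.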


Scatter plots showing the scaling relationship between changes 
in global mean temperature (ΔTglob) and regional extreme index
changes (see Figs 3 and 4b) are based on decadal averages of the 
respective variables. These averages of local anomalies relative 
to the 1861-1880 average were calculated for moving ten-year 
windows, and moving average values were assigned to the last 
year of each window period (that is, the value for year 2010 
represents the average during 2001-2010; note that in the case 
of Fig. 1 the decadal global temperature averages are assigned to 
the year directly following that decade). These moving ten-year 
averages were also used to produce maps of local changes for 
a global mean temperature increase of 2°C (see Fig. 2). The 
indicated cumulative CO₂ emissions corresponding to different
global mean temperature increases (red ticks on horizontal axes 
in Figs 3 and 4b) were approximated from the RCP8.5 ensemble 
average in Fig. 1 (single values were assigned to each of the 
chosen tick marks). This means that 500 billion tonnes of carbon 
(500 GtC) are emitted for a global increase of approximately 
1.2°C, 1,000 GtC for 2.35°C, 1,500 GtC for 3.5°C, and 2,000 GtC
for 4.45°C. Respective analyses regarding the scaling of extreme 
temperatures and precipitation in all 26 regions of the IPCC 
Special Report on Extremes’ and global land are provided in the 
Supplementary Information. 
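
The decadal averaging described in this box can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical and the input is a synthetic linear trend, not model output.

```python
def anomalies(series, years, base_start=1861, base_end=1880):
    """Subtract the mean over the base period (inclusive) from every value."""
    base = [v for v, y in zip(series, years) if base_start <= y <= base_end]
    base_mean = sum(base) / len(base)
    return [v - base_mean for v in series]


def trailing_decadal_means(values, years, window=10):
    """Moving `window`-year means, each labelled with the LAST year of its window."""
    out = {}
    for end in range(window - 1, len(values)):
        chunk = values[end - window + 1 : end + 1]
        out[years[end]] = sum(chunk) / window
    return out


years = list(range(1861, 2100))                   # 1861-2099, as in the analysis
synthetic = [0.02 * (y - 1861) for y in years]    # made-up warming of 0.02 degrees per year

decadal = trailing_decadal_means(anomalies(synthetic, years), years)
print(round(decadal[2010], 3))   # mean anomaly over 2001-2010, relative to 1861-1880
```

Labelling each window with its final year means that the first available value is for 1870 (the 1861-1870 average) and the last is for 2099.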


Global versus regional climate targets 

Experience shows that the implications of projected global mean temper- 
ature changes tend to be underestimated at regional (and country) level, 
because the global changes are much smaller than the expected changes 
in regional temperature mean and extremes over most land areas’~'°. The 
limitations of global mean temperature as a measure of climate change 
have, for instance, been made evident by the public debate about the 
recent ‘hiatus’ decade in global warming, which has focused attention on 
changes in ΔTglob instead of on the discernible worldwide impacts of the
continued increases in radiative forcing’!!-"4. 

As illustrated in Fig. 2, a 2°C target for ΔTglob implies increases in both
warm and cold temperature extremes that are greater than 2°C over most 
land regions. This is due to the land-sea contrast!** in response to radi- 
ative forcing, as well as to feedbacks (for example, from decreases in soil 
moisture, snow or ice”*!7-?°), which further amplify changes in extreme 
temperatures in some key regions. As an example, the 2°C global mean
temperature target implies 3°C warming in hot temperature extremes
in the Mediterranean region (Fig. 2a) and about 5.5°C warming in cold
temperature extremes over land in the Arctic region (Fig. 2b). Hence, 
these changes in regional extremes are greater than those in global mean 
temperature by a factor of about 1.5 and 2.5-3 (Supplementary Fig. 1), 
respectively. As highlighted above, this stronger warming of extremes on 
land compared to that of global mean temperature is related both to the 
larger warming of mean temperature on land (Fig. 2c), as well as to an 
additional specific warming of extremes in several regions (Fig. 2a, b). 
Subjectively, such regional changes in extremes may convey the consequences
of crossing the respective cumulative CO₂ emissions threshold
better than the associated change in ΔTglob (2°C), which seems relatively
mild in comparison. 

We make the case here for more easily interpretable analyses that relate
global cumulative CO₂ emissions targets to changes in regional extremes
or other impact-relevant quantities in addition to changes in global mean
temperature. Although the IPCC Synthesis Report”! shows cumulative
CO₂ emissions alongside their ‘reasons for concerns’, the bars (of various
shades of red) provide only a qualitative assessment. We highlight
here how quantitative analyses relating cumulative emissions to climate 
change at the national or regional scale could provide more targeted and 
actionable information for the decision process. 


Regional extremes versus global CO₂ emissions

We thus assess the extent to which the implications of figure SPM.10 of 
ref. 1 (Fig. 1) can be expanded to relate cumulative global emissions in 
CO₂ with regional changes in temperature extremes (annual maximum
and minimum temperatures; see Box 1). The result is displayed in Fig. 3 
for four example regions with relatively strong scaling (the Mediterranean 
basin; the contiguous USA; central Brazil for annual maximum daytime 
temperatures; and the Arctic for annual minimum night-time 
temperatures). For other regions, see Supplementary Figs 4 and 5. The 
analyses display the scaling of the regional changes considered with the 
changes in global mean temperature for a range of climate projections, 
and provide the associated expected allowable cumulative global CO₂
emissions (but without considering the uncertainty in translating ΔTglob
to cumulative emissions). 

The results show that changes in regional extreme temperatures display
an approximately linear scaling with ΔTglob, which is also mostly independent
of the emission scenario considered (Fig. 3). Hence, regional changes in
temperature extremes can be usefully related to given cumulative CO₂
targets, without any consideration of the emission pathway. However,
scaling for regional extremes on land is generally steeper than for ΔTglob
(see also analyses for other land regions in Supplementary Figs 4 and 5).
Hence, as expected from Fig. 2, the relationship between the increase in
regional temperature extremes and the increase in global mean temperature
typically implies a larger change of the former at more local scales.

For instance, a 2°C warming in hot extremes (annual warmest daytime
temperature, TXx) takes place in the Mediterranean for a change
of 1.4°C in ΔTglob (Fig. 3a). The corresponding allowable cumulative
CO₂ emissions are therefore about 600 GtC for a 2°C warming of hot
extremes in the Mediterranean region compared to about 850 GtC for a
2°C warming in global mean. Given current political tensions around the
Mediterranean basin, implications of locally more rapid climate change
could extend to regional impacts”, adding to wider political instability
(see for example the purported impacts of drought in Syria?*”*).
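
These budget figures follow from the temperature-to-emissions correspondence listed in Box 1. As a rough check, the sketch below interpolates linearly between the four tabulated points. Linear interpolation is an assumption made here for illustration (the authors read their values from the RCP8.5 ensemble average in Fig. 1), and the function name is hypothetical.

```python
# (global mean warming in degrees C, cumulative emissions in GtC), as listed in Box 1
BOX1_POINTS = [(1.2, 500.0), (2.35, 1000.0), (3.5, 1500.0), (4.45, 2000.0)]


def allowable_emissions(delta_t_glob):
    """Cumulative CO2 emissions (GtC) for a given global mean warming,
    by piecewise-linear interpolation between the Box 1 points."""
    for (t0, e0), (t1, e1) in zip(BOX1_POINTS, BOX1_POINTS[1:]):
        if t0 <= delta_t_glob <= t1:
            return e0 + (delta_t_glob - t0) / (t1 - t0) * (e1 - e0)
    raise ValueError("warming outside the tabulated range of 1.2-4.45 degrees C")


print(round(allowable_emissions(2.0)))   # 2 degrees C of global mean warming
print(round(allowable_emissions(1.4)))   # global warming at which Mediterranean hot extremes reach +2 degrees C
```

The interpolated values, roughly 848 and 587 GtC, are consistent with the rounded figures of about 850 and 600 GtC quoted in the text.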

Figure 2 | Extreme (and mean) temperature changes associated with a
2°C target. Local changes associated with a global warming of 2°C
are shown for hottest daytime temperature (TXx) (a), annual coldest
night-time temperature (TNn) (b), and mean temperature (Tmean) (c).
The analysis is based on RCP8.5 scenario simulations (ensemble average
year 2044; based on 25 model simulations, see Supplementary Table 1).
The respective scaling expressed as ratio of global mean temperature
increase is provided in Supplementary Fig. 1. Note that very similar
results are obtained with the RCP4.5 scenario simulations (Fig. 3 and
Supplementary Figs 2 and 3, based on 22 model simulations). Panels a
and b also display the outlines of the regions analysed in Fig. 3.

Scaling extreme hot temperatures in the contiguous USA and central
Brazil (Fig. 3b, c) by ΔTglob provides qualitatively similar results, but
highlights greater uncertainty of projections in these regions. In the contiguous
USA, although the expected value of scaling with ΔTglob is greater than
1, the uncertainty range bounds the 1:1 (identity) line. Conversely, the
regional response in central Brazil is significantly different from the 1:1 line
despite the larger uncertainty range compared to the Mediterranean region.
The response of the regional changes in annual coldest night-time
temperatures (TNn) in the Arctic (Fig. 3d) conveys a very stark message. In this
case, as seen in Fig. 2, the regional response is about 2.5-3 times greater for
the coldest extremes than for the global mean temperature change, with an
increase of about 5.5°C for the 2°C global warming target. In addition, it is
evident that a regional 2°C threshold was passed in the simulations around
the year 2000 for TNn in the Arctic, while it is projected to be reached by
about 2030 for TXx in the Mediterranean, central Brazil and the contiguous
USA, and only by the mid-2040s for the global mean temperature, under
the business-as-usual (unchecked) emissions scenario (RCP8.5, which leads
to a radiative forcing of 8.5 W m⁻² by 2100 relative to pre-industrial values).
For a 1.5°C global warming target, we also note that substantial regional
changes in temperature extremes would still occur, with (for example) a
4.4°C warming in TNn in the Arctic and a 2.2°C increase in TXx in the
Mediterranean region (Fig. 3).

Figure 3 | Scaling between regional changes in annual temperature
extremes and changes in global mean temperature, with associated
global cumulative CO₂ emissions targets. See Box 1 for details on the
underlying analysis. Results are shown for annual maximum daytime
temperature (TXx) in the Mediterranean region (30° to 45° N, 10° W to
45° E) (a), the contiguous USA (25° to 50° N, 125° W to 67° W) (b), and
central Brazil (30° S to 0° N, 65° W to 50° W) (c), and for the annual
minimum night-time temperature (TNn) in the Arctic (65° to 90° N, 180° W
to 180° E) (d). The four analysed regions are indicated in Fig. 2a and b.
The solid black line denotes the ensemble average in the historical runs
until 2010 (combined with RCP8.5 for 2006-2010) and the solid red (blue)
line denotes the ensemble average of the future projections following
the RCP8.5 (RCP4.5) scenario simulations, based on 25 (22) model
simulations (see Supplementary Table 1). The red shaded area indicates the
total range (minimum to maximum value) for all considered simulations
and experiments. The dashed black line shows the identity (1:1) line. Grey
dashed lines show the temperatures or CO₂ emissions associated with 2°C
increases in global mean and regional extreme temperatures, respectively.
Note the different vertical axes for TXx and TNn. Only land grid cells were
used for calculating the regional TXx and TNn averages.

Although we have illustrated the concept of regional and impact-related
climate targets with regional changes in temperature extremes, similar
reasoning can be applied to a range of other responses to global climate
forcing” (for example, changes in heavy precipitation events, see below).
These are also highly relevant in comprehending the regional implications
of global CO₂ emissions. As a further illustration, we display in Fig. 4 the
scaling of heavy precipitation events with global mean temperature, and
the respective relationship between cumulative CO₂ emissions and resulting
changes in heavy precipitation in Southern Asia. As for regional
temperature extremes, multi-model average changes in heavy precipitation
display an almost linear scaling with the changes in global mean
temperature”® (roughly consistent with the Clausius-Clapeyron relationship in
that region), and thus could be used to provide regional decision-makers
with suitable allowable targets for global emissions.

Moreover, it should be noted that, while the ensemble mean response
is robust across models and emissions scenarios for heavy precipitation
events, individual model projections can diverge strongly from this
mean response (in the region investigated as well as in other locations;
see Supplementary Figs 6 and 7). This is obvious from the red-shaded
uncertainty range in Fig. 4b and Supplementary Figs 6 and 7, which is
substantially larger in most regions than for temperature extremes. This
behaviour is due to the increasing relevance of internal climate
variability at the regional-to-local scale’’, higher model uncertainty, and the
spatially more heterogeneous nature of precipitation extremes compared
to temperature extremes.

Despite the associated uncertainty, analyses such as the ones in Figs 3 
and 4b provide more information to regional stakeholders than a global 
mean temperature target, since they quantitatively and directly highlight the 
expected regional response (in extremes and other variables than tempera- 
ture), with attendant lower and upper bounds. Such estimates are thus more 
useful when assessing associated impacts, and engaging with policymakers. 
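
For reference, the Clausius-Clapeyron relationship invoked above for heavy precipitation implies that the saturation vapour pressure of water rises by roughly 6 to 7% per degree of warming at typical surface temperatures. The sketch below estimates that rate from the Magnus approximation. The choice of approximation and its coefficients (one commonly used parameter set) are assumptions made here for illustration and are not taken from the paper.

```python
import math

# Magnus coefficients: assumed parameter set (dimensionless, degrees C)
A, B = 17.625, 243.04


def saturation_vapour_pressure(t_c):
    """Saturation vapour pressure over water in hPa (Magnus approximation)."""
    return 6.1094 * math.exp(A * t_c / (t_c + B))


def fractional_increase_per_degree(t_c):
    """Relative increase in saturation vapour pressure for 1 degree C of warming."""
    return saturation_vapour_pressure(t_c + 1.0) / saturation_vapour_pressure(t_c) - 1.0


for t in (10, 20, 30):
    print(f"{t} C: {100 * fractional_increase_per_degree(t):.1f}% per degree")
```

A regional scaling of heavy precipitation of a few per cent per degree of global warming, as mapped in Fig. 4a, can be compared against this thermodynamic rate.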


Figure 4 | Scaling of 5-day heavy precipitation events with global
mean temperature changes, with associated global cumulative CO₂
emissions targets. See Box 1 for details on the underlying analysis. a, Map
of the ratio of percentage changes in heavy precipitation events (annual
maximum consecutive 5-day precipitation, Rx5day) with changes in global
mean temperature ΔTglob for the RCP8.5 scenario simulations (ensemble
average ratio ΔRx5day/ΔTglob). ΔTglob and ΔRx5day were calculated
from each model run as the difference between the average of the first
(1861-1880) and last (2080-2099) 20-year time slices. b, Scaling of
percentage changes in Rx5day in Southern Asia (10° to 30° N, 60° to
110° E; see outlined box in a) with global mean temperature changes
(Colour scale in a: percentage change in Rx5day, in % per °C. Additional horizontal axis in b: cumulative total CO₂ emissions from 1870, in GtC.)

Limitations of approach
Some caveats are attached to the above findings, most importantly:

(1) Scaling relationships are only meaningful as long as associated
uncertainties in projections are kept within reasonable bounds. This
is the case for some climate features, such as temperature extremes or
heavy precipitation events’, but for others, such as droughts, tropical
cyclones or storms, uncertainties are generally larger than the climate
change signals’””*. In such situations, no emissions target (or implied
global temperature target) may currently be set on the basis of avoiding
changes in these extremes.

(2) Some changes in the climate system may be abrupt (that is, non- 
linearly related to emissions) owing to tipping points”. Again, uncer- 
tainties in the associated projections are very large, especially under 
high-end emissions. Owing to the nonlinearity of the respective features, 
relationships could be difficult to derive (although some features have 
been assessed, such as the dependency of mean sea level rise on global 
mean temperature increase at equilibrium” and the probability of abrupt 
changes for given global temperature thresholds*’). 

(3) Although we find a relatively robust scaling of regional temperature
and precipitation extremes with ΔTglob, we can expect that the
reliability of scaling will diminish at increasingly smaller scales owing 
to internal climate variability?”*? and a larger contribution of local pro- 
cesses to the response (including by local land surface and human forc- 
ing, see point (5) below). 

(4) It is likely that climate models share common biases for some 
regional climate phenomena**-**. In this case, scaling features could be 
derived, but would be erroneous, an issue that would need to be exam- 
ined with careful model evaluation*”** contingent on the availability of 
appropriate observations. 

(5) The relationship between changes in regional climate and ΔTglob
would be expected to alter in the presence of time-varying local forc- 
ing by, for example, aerosols*’, land-use and land-cover change*?-*?, 
urban development, or human water use***°, These effects are likely 
to be important on the local scale, but less so for the larger regions con- 
sidered here (see Figs 3 and 4 and the regions from the IPCC Special 
Report on Extremes’ in the Supplementary Information). 

Figure 4 (continued) | and cumulative global CO2 emissions. The solid black
line denotes the ensemble average in the historical runs until 2010 (combined
with RCP8.5 for 2006-2010), and the solid red (blue) line denotes the ensemble
average of the future projections following the RCP8.5 (RCP4.5) scenario
simulations, based on 25 (22) model simulations (see Supplementary
Table 1). The red shaded area indicates the total range (minimum to
maximum value) for all considered simulations and experiments.
Grey dashed lines show the percentage change in Rx5day or CO2 emissions
associated with a 2°C increase in global mean temperature. Only land grid
cells were used for calculating the regional Rx5day average. [Panel b axis:
global mean temperature anomaly relative to 1861-1880 (°C).]

(6) The ranges in Figs 3 and 4b reflect the uncertainty in the scaling
of the regional quantities with ΔTglob, but do not include uncertainties
associated with the scaling of ΔTglob with the cumulative CO2 emissions
(Fig. 1). This additional uncertainty source is also relevant for the
decision process when assessing regional climate targets (as is the case 
for climate targets based on the global mean temperature). For a given 
impact threshold, the uncertainty in the cumulative carbon would be 
wider, and as a consequence the cumulative carbon budget would be 
smaller if the desire were to avoid the impact with high probability’. 
More in-depth analyses of the CMIP5 archive would help determine 
the total uncertainty range when directly relating imposed greenhouse 
gas forcing to simulated regional extremes. 
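
As a rough illustration of the scaling described in the Figure 4 caption, the sketch below computes a per-model ratio of regional percentage change to global mean temperature change and summarizes the ensemble by its mean and its minimum-to-maximum range. The function name and the three example model runs are hypothetical and are not values from the analysis; this is a minimal sketch under those assumptions, not the procedure of Box 1.

```python
from statistics import mean


def scaling_ratio(delta_regional_pct, delta_t_glob):
    """Regional percentage change per degree of global mean warming."""
    return delta_regional_pct / delta_t_glob


# Hypothetical model runs (NOT data from the paper):
# (percentage change in regional Rx5day, change in global mean temperature in °C)
runs = [(12.0, 4.0), (20.0, 4.5), (9.0, 3.6)]

ratios = [scaling_ratio(regional, t_glob) for regional, t_glob in runs]

# Ensemble average plus total range (minimum to maximum), as in the shaded band
summary = {"mean": mean(ratios), "min": min(ratios), "max": max(ratios)}
print(summary)
```

The spread between the minimum and maximum ratio plays the role of the shaded range; as point (6) notes, it does not include the separate uncertainty linking global temperature to cumulative emissions.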


Using regional targets in decision making 

We focus here on regional changes because local stakeholders and 
decision-makers are more likely to be able to relate to them than to global 
mean temperature changes. However, we stress that this does not imply 
that countries should only be concerned about climate changes affecting 
them directly in a geographical sense. Indeed, because of globalization, 
major climate disruptions in some countries can strongly affect others, for 
instance owing to political unrest, migration, impacts on global food pro- 
duction, supply chains and trade’***4”, Even when not directly affected 
by such changes, individual countries are more likely to understand the 
implications of climate targets for other parties if they can more readily 
quantify the specific implications for different regions. This could also 
help pave the way to solutions that integrate both climate mitigation and 
adaptation within climate negotiations, by incorporating the costs of 
impacts into negotiations. Global temperature targets that differ from 
and are possibly lower than 2°C (such as 1.5°C)*8-°° may thus well be 
desirable on the basis of inferred regional climate targets. 

Linking cumulative CO2 emission targets to regional consequences,
such as changing climate extremes, would be of particular benefit for 
political decision-making, both in the context of climate negotiations 
and of adaptation. We stress that the quantification of regional targets 
will not necessarily imply that all involved parties will agree on a suitable
(and common) cumulative global CO2 emission target. However,
regional information can help in the development of solutions and
in communication with the public. Similarly robust regional scaling 
might be expected for other features of the climate system beside those 
considered here®!’, and could be explored for impact-based simula- 
tions?*-°. Indeed, such relationships can be determined for any regional 
or impact-relevant climate feature that scales robustly with changes in 
global mean temperature (or is at least monotonically related to it), and 
that is not associated with larger uncertainty ranges or biases in current 
climate models. 

In view of the inherent model uncertainty and to avoid possible risks 
associated with the indiscriminate use of such information, we recom- 
mend that IPCC-calibrated language be applied when assessing the 
confidence of any such derived relationships, with only situations of 
‘high confidence’ justifying the derivation of quantitative estimates. In
addition to the requirement of levels of high confidence, a high signal- 
to-noise model ratio (traditionally referred to in ‘likelihood’ terms in the 
language’ of the IPCC) is a prerequisite for deriving meaningful allowable 
CO2 emissions ranges. Furthermore, any assessment of projected changes
in climate risks and impacts also needs to consider the contributions of 
changes in vulnerability and exposure of human and natural systems to 
those climate hazards”’. Bearing in mind these requirements, quantita- 
tive tools for decision-making that relate regional (or even country-scale) 
impacts to global CO2 emissions targets could be one way of advancing
climate negotiations by exposing what is at stake in a more local
manner. 
Mastering the game of Go with deep neural networks and tree search
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its
enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach 
to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep 
neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement 
learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state- 
of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a 
new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, 
our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go
champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the 
full-sized game of Go, a feat previously thought to be at least a decade away. 


All games of perfect information have an optimal value function, v*(s),
which determines the outcome of the game, from every board position
or state s, under perfect play by all players. These games may be solved
by recursively computing the optimal value function in a search tree
containing approximately b^d possible sequences of moves, where b is
the game's breadth (number of legal moves per position) and d is its
depth (game length). In large games, such as chess (b ≈ 35, d ≈ 80)¹ and
especially Go (b ≈ 250, d ≈ 150)¹, exhaustive search is infeasible²,³, but
the effective search space can be reduced by two general principles.
First, the depth of the search may be reduced by position evaluation:
truncating the search tree at state s and replacing the subtree below s
by an approximate value function v(s) ≈ v*(s) that predicts the outcome
from state s. This approach has led to superhuman performance in
chess⁴, checkers⁵ and othello⁶, but it was believed to be intractable in Go
due to the complexity of the game⁷. Second, the breadth of the search
may be reduced by sampling actions from a policy p(a|s) that is a
probability distribution over possible moves a in position s. For example,
Monte Carlo rollouts⁸ search to maximum depth without branching
at all, by sampling long sequences of actions for both players from a
policy p. Averaging over such rollouts can provide an effective position
evaluation, achieving superhuman performance in backgammon⁸ and
Scrabble⁹, and weak amateur level play in Go¹⁰.
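
To make the scale of exhaustive search concrete, the short sketch below evaluates the order of magnitude of b^d for the approximate breadth and depth values quoted above. The function name is illustrative only; the calculation works in log10 because the numbers are too large to be meaningful as plain floats.

```python
import math


def log10_sequences(breadth: int, depth: int) -> float:
    """Return log10(breadth ** depth), the order of magnitude of the move-sequence count."""
    return depth * math.log10(breadth)


chess = log10_sequences(35, 80)    # roughly 123.5
go = log10_sequences(250, 150)     # roughly 359.7

print(f"chess: about 10^{chess:.1f} sequences")
print(f"Go:    about 10^{go:.1f} sequences")
```

On these approximate figures the Go tree is larger than the chess tree by more than 230 orders of magnitude, which is why both depth reduction (position evaluation) and breadth reduction (policy sampling) are needed.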

Monte Carlo tree search (MCTS)¹¹,¹² uses Monte Carlo rollouts
to estimate the value of each state in a search tree. As more simulations
are executed, the search tree grows larger and the relevant
values become more accurate. The policy used to select actions during
search is also improved over time, by selecting children with higher
values. Asymptotically, this policy converges to optimal play, and the
evaluations converge to the optimal value function¹². The strongest
current Go programs are based on MCTS, enhanced by policies that
are trained to predict human expert moves¹³. These policies are used
to narrow the search to a beam of high-probability actions, and to
sample actions during rollouts. This approach has achieved strong
amateur play¹³⁻¹⁵. However, prior work has been limited to shallow
policies¹³⁻¹⁵ or value functions¹⁶ based on a linear combination of
input features.

Recently, deep convolutional neural networks have achieved unprecedented
performance in visual domains: for example, image classification¹⁷,
face recognition¹⁸, and playing Atari games¹⁹. They use many
layers of neurons, each arranged in overlapping tiles, to construct
increasingly abstract, localized representations of an image²⁰. We
employ a similar architecture for the game of Go. We pass in the board 
position as a 19 x 19 image and use convolutional layers to construct a 
representation of the position. We use these neural networks to reduce 
the effective depth and breadth of the search tree: evaluating positions 
using a value network, and sampling actions using a policy network. 

We train the neural networks using a pipeline consisting of several
stages of machine learning (Fig. 1). We begin by training a supervised
learning (SL) policy network p_σ directly from expert human moves.
This provides fast, efficient learning updates with immediate feedback
and high-quality gradients. Similar to prior work¹³,¹⁵, we also train a
fast policy p_π that can rapidly sample actions during rollouts. Next, we
train a reinforcement learning (RL) policy network p_ρ that improves
the SL policy network by optimizing the final outcome of games of
self-play. This adjusts the policy towards the correct goal of winning games,
rather than maximizing predictive accuracy. Finally, we train a value
network v_θ that predicts the winner of games played by the RL policy
network against itself. Our program AlphaGo efficiently combines the
policy and value networks with MCTS.


Supervised learning of policy networks 

For the first stage of the training pipeline, we build on prior work
on predicting expert moves in the game of Go using supervised
learning¹³,²¹⁻²⁴. The SL policy network p_σ(a|s) alternates between
convolutional layers with weights σ, and rectifier nonlinearities. A final
softmax layer outputs a probability distribution over all legal moves a. The
input s to the policy network is a simple representation of the board state
(see Extended Data Table 2). The policy network is trained on randomly
[Figure 1 panel label: Human expert positions]
Figure 1 | Neural network training pipeline and architecture. a, A fast
rollout policy p_π and supervised learning (SL) policy network p_σ are
trained to predict human expert moves in a data set of positions.
A reinforcement learning (RL) policy network p_ρ is initialized to the SL
policy network, and is then improved by policy gradient learning to
maximize the outcome (that is, winning more games) against previous
versions of the policy network. A new data set is generated by playing
games of self-play with the RL policy network. Finally, a value network v_θ
is trained by regression to predict the expected outcome (that is, whether


sampled state-action pairs (s, a), using stochastic gradient ascent to
maximize the likelihood of the human move a selected in state s

$$\Delta\sigma \propto \frac{\partial \log p_\sigma(a \mid s)}{\partial \sigma}$$

We trained a 13-layer policy network, which we call the SL policy
network, from 30 million positions from the KGS Go Server. The network
predicted expert moves on a held out test set with an accuracy of
57.0% using all input features, and 55.7% using only raw board position
and move history as inputs, compared to the state-of-the-art from
other research groups of 44.4% at date of submission²⁴ (full results in
Extended Data Table 3). Small improvements in accuracy led to large
improvements in playing strength (Fig. 2a); larger networks achieve
better accuracy but are slower to evaluate during search. We also
trained a faster but less accurate rollout policy p_π(a|s), using a linear
softmax of small pattern features (see Extended Data Table 4) with
weights π; this achieved an accuracy of 24.2%, using just 2 µs to select
an action, rather than 3 ms for the policy network.
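
The likelihood-maximizing update described above can be illustrated on a toy linear softmax policy. This is a minimal sketch under stated assumptions: the policy here is a small linear model over a hand-written feature vector, not the 13-layer convolutional network, and the names `sl_update`, `weights`, `features` and the learning rate are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sl_update(weights, features, expert_move, lr=0.1):
    """One stochastic gradient ascent step on log p(expert_move | s).

    weights[a][k] is the weight of feature k for move a (toy linear policy);
    features[k] is the feature vector of state s. Returns the probability
    assigned to the expert move BEFORE this update is applied.
    """
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(logits)
    for a, row in enumerate(weights):
        # d log p(expert | s) / d logit_a = 1[a == expert] - p(a | s)
        grad_logit = (1.0 if a == expert_move else 0.0) - probs[a]
        for k in range(len(row)):
            row[k] += lr * grad_logit * features[k]
    return probs[expert_move]


weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # three candidate moves
state = [1.0, 0.5]
before = sl_update(weights, state, expert_move=2)
after = sl_update(weights, state, expert_move=2)
print(before, after)  # the expert move becomes more likely after one step
```

Repeating the step on many sampled (s, a) pairs raises the probability the policy assigns to the moves experts actually played, which is what the reported prediction accuracy measures.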


Reinforcement learning of policy networks 

The second stage of the training pipeline aims at improving the policy 
network by policy gradient reinforcement learning (RL)²⁵,²⁶. The RL
policy network p_ρ is identical in structure to the SL policy network,
Figure 2 | Strength and accuracy of policy and value networks.

a, Plot showing the playing strength of policy networks as a function 

of their training accuracy. Policy networks with 128, 192, 256 and 384 
convolutional filters per layer were evaluated periodically during training; 
the plot shows the winning rate of AlphaGo using that policy network 


against the match version of AlphaGo. b, Comparison of evaluation 
accuracy between the value network and rollouts with different policies. 


[Figure 1 panel label: Self-play positions]

the current player wins) in positions from the self-play data set.
b, Schematic representation of the neural network architecture used in
AlphaGo. The policy network takes a representation of the board position
s as its input, passes it through many convolutional layers with parameters
σ (SL policy network) or ρ (RL policy network), and outputs a probability
distribution p_σ(a|s) or p_ρ(a|s) over legal moves a, represented by a
probability map over the board. The value network similarly uses many
convolutional layers with parameters θ, but outputs a scalar value v_θ(s′)
that predicts the expected outcome in position s′.


and its weights ρ are initialized to the same values, ρ = σ. We play
games between the current policy network p_ρ and a randomly selected
previous iteration of the policy network. Randomizing from a pool
of opponents in this way stabilizes training by preventing overfitting
to the current policy. We use a reward function r(s) that is zero for all
non-terminal time steps t < T. The outcome z_t = ±r(s_T) is the terminal
reward at the end of the game from the perspective of the current
player at time step t: +1 for winning and −1 for losing. Weights are
then updated at each time step t by stochastic gradient ascent in the
direction that maximizes expected outcome²⁵

$$\Delta\rho \propto \frac{\partial \log p_\rho(a_t \mid s_t)}{\partial \rho}\, z_t$$
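
The outcome-weighted update can be sketched with a tabular softmax policy standing in for the network weights. Everything named here (`reinforce_update`, the `logits` table, the state label `"s0"`, the learning rate) is a hypothetical illustration; the sketch records only the learner's own moves, so the outcome z is the same for every step of the game.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, trajectory, z, lr=0.1):
    """Apply an update proportional to d log p(a_t | s_t) * z for each step.

    logits: dict mapping a state to a list of action preferences
            (a tabular stand-in for the policy parameters).
    trajectory: list of (state, action) pairs chosen by the learner.
    z: +1 if the learner won the game, -1 if it lost.
    """
    for state, action in trajectory:
        probs = softmax(logits[state])
        for a in range(len(probs)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            logits[state][a] += lr * z * grad


logits = {"s0": [0.0, 0.0]}
reinforce_update(logits, [("s0", 1)], z=+1)   # a won game reinforces action 1
won = list(logits["s0"])
reinforce_update(logits, [("s0", 0)], z=-1)   # a lost game discourages action 0
lost = list(logits["s0"])
print(won, lost)
```

Moves from won games become more probable and moves from lost games less probable, with no reward signal needed before the terminal step.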


We evaluated the performance of the RL policy network in game
play, sampling each move a_t ∼ p_ρ(·|s_t) from its output probability
distribution over actions. When played head-to-head, the RL policy
network won more than 80% of games against the SL policy network.
We also tested against the strongest open-source Go program, Pachi¹⁴,
a sophisticated Monte Carlo search program, ranked at 2 amateur dan 
on KGS, that executes 100,000 simulations per move. Using no search 
at all, the RL policy network won 85% of games against Pachi. In com- 
parison, the previous state-of-the-art, based only on supervised 
Positions and outcomes were sampled from human expert games. Each
position was evaluated by a single forward pass of the value network v_θ,
or by the mean outcome of 100 rollouts, played out using either uniform
random rollouts, the fast rollout policy p_π, the SL policy network p_σ or
the RL policy network p_ρ. The mean squared error between the predicted
value and the actual game outcome is plotted against the stage of the game
(how many moves had been played in the given position).
Figure 3 | Monte Carlo tree search in AlphaGo. a, Each simulation
traverses the tree by selecting the edge with maximum action value Q,
plus a bonus u(P) that depends on a stored prior probability P for that
edge. b, The leaf node may be expanded; the new node is processed once
by the policy network p_σ and the output probabilities are stored as prior
probabilities P for each action. c, At the end of a simulation, the leaf node


learning of convolutional networks, won 11% of games against Pachi²³
and 12% against a slightly weaker program, Fuego²⁴.


Reinforcement learning of value networks 

The final stage of the training pipeline focuses on position evaluation,
estimating a value function v^p(s) that predicts the outcome from position
s of games played by using policy p for both players²⁸⁻³⁰

$$v^{p}(s) = \mathbb{E}\left[z_t \mid s_t = s,\ a_{t \ldots T} \sim p\right]$$

Ideally, we would like to know the optimal value function under
perfect play v*(s); in practice, we instead estimate the value function
v^{p_ρ} for our strongest policy, using the RL policy network p_ρ. We
approximate the value function using a value network v_θ(s) with weights θ,
v_θ(s) ≈ v^{p_ρ}(s) ≈ v*(s). This neural network has a similar architecture
to the policy network, but outputs a single prediction instead of a
probability distribution. We train the weights of the value network by
regression on state-outcome pairs (s, z), using stochastic gradient descent to
minimize the mean squared error (MSE) between the predicted value
v_θ(s), and the corresponding outcome z

$$\Delta\theta \propto \frac{\partial v_\theta(s)}{\partial \theta}\,\bigl(z - v_\theta(s)\bigr)$$
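
The regression update can be illustrated with a one-layer stand-in for the value network. The model below, a tanh of a linear score over a hypothetical feature vector, is an assumption chosen only so that predictions stay in the range of the outcome z; the names `value`, `value_update` and the learning rate are likewise illustrative.

```python
import math


def value(theta, x):
    """Toy value function: tanh of a linear score, so the output lies in (-1, 1)."""
    return math.tanh(sum(t * xi for t, xi in zip(theta, x)))


def value_update(theta, x, z, lr=0.1):
    """One gradient step that reduces the squared error (z - v)^2.

    The step is proportional to dv/dtheta * (z - v); for this toy model
    dv/dtheta_k = (1 - v^2) * x_k.
    """
    v = value(theta, x)
    dv_dscore = 1.0 - v * v
    return [t + lr * (z - v) * dv_dscore * xi for t, xi in zip(theta, x)]


theta = [0.0, 0.0]
x = [1.0, -0.5]          # hypothetical features of one position
z = 1.0                  # the game from this position was won

err_before = (z - value(theta, x)) ** 2
for _ in range(50):
    theta = value_update(theta, x, z)
err_after = (z - value(theta, x)) ** 2
print(err_before, err_after)
```

Fitting a single position this way shows the direction of the update only; it says nothing about generalization, which is the concern taken up in the following paragraph.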


The naive approach of predicting game outcomes from data con- 
sisting of complete games leads to overfitting. The problem is that 
successive positions are strongly correlated, differing by just one stone, 
but the regression target is shared for the entire game. When trained 
on the KGS data set in this way, the value network memorized the 
game outcomes rather than generalizing to new positions, achieving a 
minimum MSE of 0.37 on the test set, compared to 0.19 on the training 
set. To mitigate this problem, we generated a new self-play data set 
consisting of 30 million distinct positions, each sampled from a sepa- 
rate game. Each game was played between the RL policy network and 
itself until the game terminated. Training on this data set led to MSEs 
of 0.226 and 0.234 on the training and test set respectively, indicating 
minimal overfitting. Figure 2b shows the position evaluation accuracy
of the value network, compared to Monte Carlo rollouts using the fast
rollout policy p_π; the value function was consistently more accurate.
A single evaluation of v_θ(s) also approached the accuracy of Monte
Carlo rollouts using the RL policy network p_ρ, but using 15,000 times
less computation.
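
The decorrelation idea above, taking each training position from a separate game, can be sketched as follows. This is a simplified illustration only: the game records are hypothetical strings, the outcome is attached without regard to which player is to move, and the function name is an assumption rather than part of the described pipeline.

```python
import random


def one_position_per_game(games, rng):
    """Draw a single (position, outcome) pair from each game.

    games: list of (positions, outcome) tuples, where positions is the
    sequence of positions in one game and outcome is its final result.
    Sampling one position per game avoids many near-identical positions
    sharing the same regression target.
    """
    return [(rng.choice(positions), outcome)
            for positions, outcome in games if positions]


# Hypothetical game records, for illustration only
games = [
    (["g1-move1", "g1-move2", "g1-move3"], +1),
    (["g2-move1", "g2-move2"], -1),
]
dataset = one_position_per_game(games, random.Random(0))
print(dataset)
```

Each game contributes exactly one training example, so no two examples differ by just one stone while sharing an outcome.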


Searching with policy and value networks 
AlphaGo combines the policy and value networks in an MCTS algo- 
rithm (Fig. 3) that selects actions by lookahead search. Each edge 


486 | NATURE | VOL 529 | 28 JANUARY 2016 


[Figure 3 panel label: Evaluation]

is evaluated in two ways: using the value network v_θ; and by running
a rollout to the end of the game with the fast rollout policy p_π, then
computing the winner with function r. d, Action values Q are updated to
track the mean value of all evaluations r(·) and v_θ(·) in the subtree below
that action.


(s, a) of the search tree stores an action value Q(s, a), visit count N(s, a),
and prior probability P(s, a). The tree is traversed by simulation (that
is, descending the tree in complete games without backup), starting
from the root state. At each time step t of each simulation, an action a_t
is selected from state s_t

$$a_t = \operatorname*{argmax}_{a}\,\bigl(Q(s_t, a) + u(s_t, a)\bigr)$$


so as to maximize action value plus a bonus

$$u(s, a) \propto \frac{P(s, a)}{1 + N(s, a)}$$

that is proportional to the prior probability but decays with
repeated visits to encourage exploration. When the traversal reaches a
leaf node s_L at step L, the leaf node may be expanded. The leaf position
s_L is processed just once by the SL policy network p_σ. The output
probabilities are stored as prior probabilities P for each legal action a,
P(s, a) = p_σ(a|s). The leaf node is evaluated in two very different ways:
first, by the value network v_θ(s_L); and second, by the outcome z_L of a
random rollout played out until terminal step T using the fast rollout
policy p_π; these evaluations are combined, using a mixing parameter
λ, into a leaf evaluation V(s_L)

$$V(s_L) = (1 - \lambda)\, v_\theta(s_L) + \lambda\, z_L$$


At the end of simulation, the action values and visit counts of all
traversed edges are updated. Each edge accumulates the visit count and
mean evaluation of all simulations passing through that edge

$$N(s, a) = \sum_{i=1}^{n} 1(s, a, i)$$

$$Q(s, a) = \frac{1}{N(s, a)} \sum_{i=1}^{n} 1(s, a, i)\, V(s_L^{i})$$

where s_L^i is the leaf node from the ith simulation, and 1(s, a, i) indicates
whether an edge (s, a) was traversed during the ith simulation. Once
the search is complete, the algorithm chooses the most visited move
from the root position.
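
The per-edge bookkeeping described in this section can be sketched in a few lines. This is a single-level illustration under stated assumptions, not the asynchronous search itself: the `Edge` class, the constant `c`, and the specific bonus form c·P/(1 + N) are choices consistent with the description (proportional to the prior, decaying with visits), and the sketch ignores the change of perspective between the two players.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    prior: float              # P(s, a), stored when the parent node is expanded
    visits: int = 0           # N(s, a)
    total_value: float = 0.0  # running sum of leaf evaluations V(s_L)

    @property
    def q(self) -> float:
        """Mean evaluation Q(s, a) over simulations that passed through this edge."""
        return self.total_value / self.visits if self.visits else 0.0

    def bonus(self, c: float = 1.0) -> float:
        """Exploration bonus: proportional to the prior, decaying with visits."""
        return c * self.prior / (1 + self.visits)


def select(edges: dict) -> str:
    """Pick the action maximizing action value plus bonus."""
    return max(edges, key=lambda a: edges[a].q + edges[a].bonus())


def leaf_value(v_theta: float, z_rollout: float, lam: float = 0.5) -> float:
    """Mix the value-network estimate and the rollout outcome with parameter lam."""
    return (1 - lam) * v_theta + lam * z_rollout


def backup(path, value: float) -> None:
    """Update visit counts and accumulated values on every traversed edge."""
    for edge in path:
        edge.visits += 1
        edge.total_value += value


edges = {"A": Edge(prior=0.6), "B": Edge(prior=0.4)}
first = select(edges)                                   # highest prior wins while unvisited
backup([edges[first]], leaf_value(v_theta=-0.8, z_rollout=-1.0))
second = select(edges)                                  # a poor evaluation shifts the search
most_visited = max(edges, key=lambda a: edges[a].visits)
print(first, second, most_visited)
```

In this toy run the prior steers the first simulation, a bad leaf evaluation redirects the second, and the final move choice depends on visit counts rather than on the raw action values.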

It is worth noting that the SL policy network p_σ performed better in
AlphaGo than the stronger RL policy network p_ρ, presumably because
humans select a diverse beam of promising moves, whereas RL optimizes
for the single best move. However, the value function
v_θ(s) ≈ v^{p_ρ}(s) derived from the stronger RL policy network performed
Figure 4 | Tournament evaluation of AlphaGo. a, Results of a
tournament between different Go programs (see Extended Data Tables
6-11). Each program used approximately 5 s computation time per move.
To provide a greater challenge to AlphaGo, some programs (pale upper
bars) were given four handicap stones (that is, free moves at the start of
every game) against all opponents. Programs were evaluated on an
Elo scale³⁷: a 230 point gap corresponds to a 79% probability of winning,
which roughly corresponds to one amateur dan rank advantage on
KGS³⁸; an approximate correspondence to human ranks is also shown,


better in AlphaGo than a value function v_θ(s) ≈ v^{p_σ}(s) derived from the
SL policy network.

Evaluating policy and value networks requires several orders of 
magnitude more computation than traditional search heuristics. To 
efficiently combine MCTS with deep neural networks, AlphaGo uses 
an asynchronous multi-threaded search that executes simulations on 
CPUs, and computes policy and value networks in parallel on GPUs. 
The final version of AlphaGo used 40 search threads, 48 CPUs, and 
8 GPUs. We also implemented a distributed version of AlphaGo that 


[Figure 5 panel titles: a, Value network; b, Tree evaluation from value net]
horizontal lines show KGS ranks achieved online by that program. Games
against the human European champion Fan Hui were also included;
these games used longer time controls. 95% confidence intervals are
shown. b, Performance of AlphaGo, on a single machine, for different
combinations of components. The version solely using the policy network
does not perform any search. c, Scalability study of MCTS in AlphaGo
with search threads and GPUs, using asynchronous search (light blue) or
distributed search (dark blue), for 2 s per move.


exploited multiple machines, 40 search threads, 1,202 CPUs and 
176 GPUs. The Methods section provides full details of asynchronous 
and distributed MCTS. 


Evaluating the playing strength of AlphaGo 

To evaluate AlphaGo, we ran an internal tournament among variants 
of AlphaGo and several other Go programs, including the strongest 
commercial programs Crazy Stone¹³ and Zen, and the strongest open
source programs Pachi¹⁴ and Fuego¹⁵. All of these programs are based


[Figure 5 panel title: c, Tree evaluation from rollouts]


Figure 5 | How AlphaGo (black, to play) selected its move in an
informal game against Fan Hui. For each of the following statistics,
the location of the maximum value is indicated by an orange circle.
a, Evaluation of all successors s′ of the root position s, using the value
network v_θ(s′); estimated winning percentages are shown for the top
evaluations. b, Action values Q(s, a) for each edge (s, a) in the tree from
root position s; averaged over value network evaluations only (λ = 0).
c, Action values Q(s, a), averaged over rollout evaluations only (λ = 1).
d, Move probabilities directly from the SL policy network, p_σ(a|s);
reported as a percentage (if above 0.1%). e, Percentage frequency with
which actions were selected from the root during simulations. f, The
principal variation (path with maximum visit count) from AlphaGo's
search tree. The moves are presented in a numbered sequence. AlphaGo
selected the move indicated by the red circle; Fan Hui responded with the
move indicated by the white square; in his post-game commentary he
preferred the move (labelled 1) predicted by AlphaGo.
Game 1
Fan Hui (Black), AlphaGo (White)
AlphaGo wins by 2.5 points

Game 2
AlphaGo (Black), Fan Hui (White)
AlphaGo wins by resignation

Game 3
Fan Hui (Black), AlphaGo (White)
AlphaGo wins by resignation

Game 4
AlphaGo (Black), Fan Hui (White)
AlphaGo wins by resignation

Game 5
Fan Hui (Black), AlphaGo (White)
AlphaGo wins by resignation


Figure 6 | Games from the match between AlphaGo and the European 
champion, Fan Hui. Moves are shown in a numbered sequence 
corresponding to the order in which they were played. Repeated moves 
on the same intersection are shown in pairs below the board. The first 


on high-performance MCTS algorithms. In addition, we included the 
open source program GnuGo, a Go program using state-of-the-art 
search methods that preceded MCTS. All programs were allowed 5 s 
of computation time per move. 

The results of the tournament (see Fig. 4a) suggest that single- 
machine AlphaGo is many dan ranks stronger than any previous 
Go program, winning 494 out of 495 games (99.8%) against other 
Go programs. To provide a greater challenge to AlphaGo, we also 
played games with four handicap stones (that is, free moves for the 
opponent); AlphaGo won 77%, 86%, and 99% of handicap games 
against Crazy Stone, Zen and Pachi, respectively. The distributed ver- 
sion of AlphaGo was significantly stronger, winning 77% of games 
against single-machine AlphaGo and 100% of its games against other 
programs. 
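
The Elo scale used in Fig. 4a relates a rating gap to a win probability. As a quick check of the correspondence quoted in that caption (a 230 point gap and a 79% probability of winning), the sketch below uses the standard logistic Elo expectation with a scale of 400; that this exact form underlies the reported ratings is an assumption, and the function name is illustrative.

```python
def elo_win_probability(rating_gap: float) -> float:
    """Expected score of the stronger player under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


p_230 = elo_win_probability(230)
print(f"{p_230:.3f}")  # close to 0.79

# An even match gives 0.5, and the two players' expectations sum to 1
assert abs(elo_win_probability(0) - 0.5) < 1e-12
assert abs(elo_win_probability(230) + elo_win_probability(-230) - 1.0) < 1e-12
```

Under this model each additional 230 points multiplies the odds of winning by the same factor, so a gap of several dan ranks corresponds to win rates very close to 100%.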

We also assessed variants of AlphaGo that evaluated positions
using just the value network (λ = 0) or just rollouts (λ = 1) (see
Fig. 4b). Even without rollouts AlphaGo exceeded the performance
of all other Go programs, demonstrating that value networks provide
a viable alternative to Monte Carlo evaluation in Go. However, the
mixed evaluation (λ = 0.5) performed best, winning >95% of games
against other variants. This suggests that the two position-evaluation 
move number in each pair indicates when the repeat move was played, at
an intersection identified by the second move number (see Supplementary 
Information). 


mechanisms are complementary: the value network approximates the
outcome of games played by the strong but impractically slow p_ρ, while
the rollouts can precisely score and evaluate the outcome of games
played by the weaker but faster rollout policy p_π. Figure 5 visualizes
the evaluation of a real game position by AlphaGo.

Finally, we evaluated the distributed version of AlphaGo against Fan 
Hui, a professional 2 dan, and the winner of the 2013, 2014 and 2015 
European Go championships. Over 5-9 October 2015 AlphaGo and 
Fan Hui competed in a formal five-game match. AlphaGo won the 
match 5 games to 0 (Fig. 6 and Extended Data Table 1). This is the 
first time that a computer Go program has defeated a human profes- 
sional player, without handicap, in the full game of Go—a feat that was 
previously believed to be at least a decade away³,⁷,³¹.


Discussion 

In this work we have developed a Go program, based on a combina- 
tion of deep neural networks and tree search, that plays at the level of 
the strongest human players, thereby achieving one of artificial
intelligence’s “grand challenges”³¹⁻³³. We have developed, for the first time,
effective move selection and position evaluation functions for Go, 
based on deep neural networks that are trained by a novel combination 
of supervised and reinforcement learning. We have introduced a new
search algorithm that successfully combines neural network evalu- 
ations with Monte Carlo rollouts. Our program AlphaGo integrates 
these components together, at scale, in a high-performance tree search 
engine. 

During the match against Fan Hui, AlphaGo evaluated thousands 
of times fewer positions than Deep Blue did in its chess match against 
Kasparov⁴; compensating by selecting those positions more intelligently,
using the policy network, and evaluating them more precisely,
using the value network—an approach that is perhaps closer to how 
humans play. Furthermore, while Deep Blue relied on a handcrafted 
evaluation function, the neural networks of AlphaGo are trained 
directly from gameplay purely through general-purpose supervised 
and reinforcement learning methods. 

Go is exemplary in many ways of the difficulties faced by artificial 
intelligence: a challenging decision-making task, an intractable 
search space, and an optimal solution so complex it appears infeasible 
to directly approximate using a policy or value function. The previous 
major breakthrough in computer Go, the introduction of MCTS, led to 
corresponding advances in many other domains; for example, general 
game-playing, classical planning, partially observed planning, 
scheduling, and constraint satisfaction. By combining tree search with 
policy and value networks, AlphaGo has finally reached a professional 
level in Go, providing hope that human-level performance can now be 
achieved in other seemingly intractable artificial intelligence domains. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 


Problem setting. Many games of perfect information, such as chess, checkers, 
othello, backgammon and Go, may be defined as alternating Markov games. In 
these games, there is a state space $\mathcal{S}$ (where state includes an indication of the 
current player to play); an action space $\mathcal{A}(s)$ defining the legal actions in any given 
state $s \in \mathcal{S}$; a state transition function $f(s, a, \xi)$ defining the successor state after 
selecting action $a$ in state $s$ and random input $\xi$ (for example, dice); and finally a 
reward function $r^i(s)$ describing the reward received by player $i$ in state $s$. We 
restrict our attention to two-player zero-sum games, $r^1(s) = -r^2(s) = r(s)$, with 
deterministic state transitions, $f(s, a, \xi) = f(s, a)$, and zero rewards except at a 
terminal time step $T$. The outcome of the game $z_t = \pm r(s_T)$ is the terminal reward at 
the end of the game from the perspective of the current player at time step $t$. 
A policy $p(a|s)$ is a probability distribution over legal actions $a \in \mathcal{A}(s)$. 
A value function is the expected outcome if all actions for both players are selected 
according to policy $p$, that is, $v^p(s) = \mathbb{E}[z_t \mid s_t = s, a_{t \ldots T} \sim p]$. Zero-sum games have 
a unique optimal value function $v^*(s)$ that determines the outcome from state $s$ 
following perfect play by both players, 

$$v^*(s) = \begin{cases} z_T & \text{if } s = s_T, \\ \max_a -v^*(f(s, a)) & \text{otherwise.} \end{cases}$$ 

Prior work. The optimal value function can be computed recursively by minimax 
(or equivalently negamax) search. Most games are too large for exhaustive 
minimax tree search; instead, the game is truncated by using an approximate value 
function $v(s) \approx v^*(s)$ in place of terminal rewards. Depth-first minimax search with 
alpha-beta pruning has achieved superhuman performance in chess, checkers 
and othello, but it has not been effective in Go. 

Reinforcement learning can learn to approximate the optimal value function 
directly from games of self-play. The majority of prior work has focused on a 
linear combination $v_\theta(s) = \varphi(s) \cdot \theta$ of features $\varphi(s)$ with weights $\theta$. Weights were 
trained using temporal-difference learning in chess, checkers and Go; 
or using linear regression in othello and Scrabble. Temporal-difference learning 
has also been used to train a neural network to approximate the optimal value 
function, achieving superhuman performance in backgammon; and achieving 
weak kyu-level performance in small-board Go using convolutional 
networks. 

An alternative approach to minimax search is Monte Carlo tree search 
(MCTS), which estimates the optimal value of interior nodes by a double 
approximation, $V^n(s) \approx v^{P^n}(s) \approx v^*(s)$. The first approximation, $V^n(s) \approx v^{P^n}(s)$, 
uses $n$ Monte Carlo simulations to estimate the value function of a simulation 
policy $P^n$. The second approximation, $v^{P^n}(s) \approx v^*(s)$, uses a simulation policy $P^n$ 
in place of minimax optimal actions. The simulation policy selects actions according 
to a search control function $\operatorname{argmax}_a (Q^n(s, a) + u(s, a))$, such as UCT, that 
selects children with higher action values, $Q^n(s, a) = -V^n(f(s, a))$, plus a bonus 
$u(s, a)$ that encourages exploration; or in the absence of a search tree at state $s$, it 
samples actions from a fast rollout policy $p_\pi(a|s)$. As more simulations are executed 
and the search tree grows deeper, the simulation policy becomes informed by 
increasingly accurate statistics. In the limit, both approximations become exact 
and MCTS (for example, with UCT) converges to the optimal value function 
$\lim_{n \to \infty} V^n(s) = \lim_{n \to \infty} v^{P^n}(s) = v^*(s)$. The strongest current Go programs are 
based on MCTS. 
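
The first of the two approximations, estimating the value of a fixed simulation policy from $n$ sampled games, can be sketched without any search tree. The example below is only an illustration under stated assumptions: it uses a hypothetical toy game (take 1 to 3 stones, last stone wins) and a uniformly random simulation policy, neither of which comes from the paper.

```python
import random

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins


def rollout(s, rng):
    """Play uniformly random moves from pile size s (s >= 1).

    Returns +1 if the player to move at the start wins, otherwise -1.
    """
    sign = 1  # +1 while the starting player is to move
    while True:
        a = rng.choice([m for m in MOVES if m <= s])
        s -= a
        if s == 0:
            # The player who just moved took the last stone and wins.
            return sign
        sign = -sign


def monte_carlo_value(s, n, seed=0):
    """Average outcome of n random rollouts: an estimate of v^P(s) for the
    uniformly random policy P, not of the optimal value v*(s)."""
    rng = random.Random(seed)
    return sum(rollout(s, rng) for _ in range(n)) / n
```

The estimate only approaches the optimal value when the simulation policy itself improves, which is the role of the search control function in MCTS.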

MCTS has previously been combined with a policy that is used to narrow the 
beam of the search tree to high-probability moves; or to bias the bonus term 
towards high-probability moves. MCTS has also been combined with a value 
function that is used to initialize action values in newly expanded nodes, or to 
mix Monte Carlo evaluation with minimax evaluation. By contrast, AlphaGo’s use 
of value functions is based on truncated Monte Carlo search algorithms, which 
terminate rollouts before the end of the game and use a value function in place of 
the terminal reward. AlphaGo’s position evaluation mixes full rollouts with 
truncated rollouts, resembling in some respects the well-known temporal-difference 
learning algorithm TD($\lambda$). AlphaGo also differs from prior work by using slower 
but more powerful representations of the policy and value function; evaluating 
deep neural networks is several orders of magnitude slower than linear 
representations and must therefore occur asynchronously. 

The performance of MCTS is to a large degree determined by the quality of the 
rollout policy. Prior work has focused on handcrafted patterns or learning rollout 
policies by supervised learning, reinforcement learning, simulation balancing 
or online adaptation; however, it is known that rollout-based position 
evaluation is frequently inaccurate. AlphaGo uses relatively simple rollouts, and 
instead addresses the challenging problem of position evaluation more directly 
using value networks. 


Search algorithm. To efficiently integrate large neural networks into AlphaGo, we 
implemented an asynchronous policy and value MCTS algorithm (APV-MCTS). 
Each node $s$ in the search tree contains edges $(s, a)$ for all legal actions $a \in \mathcal{A}(s)$. 
Each edge stores a set of statistics, 

$$\{P(s, a),\ N_v(s, a),\ N_r(s, a),\ W_v(s, a),\ W_r(s, a),\ Q(s, a)\}$$ 

where $P(s, a)$ is the prior probability, $W_v(s, a)$ and $W_r(s, a)$ are Monte Carlo 
estimates of total action value, accumulated over $N_v(s, a)$ and $N_r(s, a)$ leaf evaluations 
and rollout rewards, respectively, and $Q(s, a)$ is the combined mean action value for 
that edge. Multiple simulations are executed in parallel on separate search threads. 
The APV-MCTS algorithm proceeds in the four stages outlined in Fig. 3. 

Selection (Fig. 3a). The first in-tree phase of each simulation begins at the root of 
the search tree and finishes when the simulation reaches a leaf node at time step 
$L$. At each of these time steps, $t < L$, an action is selected according to the statistics 
in the search tree, $a_t = \operatorname{argmax}_a (Q(s_t, a) + u(s_t, a))$, using a variant of the PUCT 
algorithm, 

$$u(s, a) = c_{puct} P(s, a) \frac{\sqrt{\sum_b N_r(s, b)}}{1 + N_r(s, a)},$$ 

where $c_{puct}$ is a constant determining the level of exploration; this search control 
strategy initially prefers actions with high prior probability and low visit count, but 
asymptotically prefers actions with high action value. 

Evaluation (Fig. 3c). The leaf position $s_L$ is added to a queue for evaluation $v_\theta(s_L)$ 
by the value network, unless it has previously been evaluated. The second rollout 
phase of each simulation begins at leaf node $s_L$ and continues until the end of the 
game. At each of these time steps, $t \ge L$, actions are selected by both players according 
to the rollout policy, $a_t \sim p_\pi(\cdot|s_t)$. When the game reaches a terminal state, the 
outcome $z_t = \pm r(s_T)$ is computed from the final score. 

Backup (Fig. 3d). At each in-tree step $t < L$ of the simulation, the rollout statistics 
are updated as if it has lost $n_{vl}$ games, $N_r(s_t, a_t) \leftarrow N_r(s_t, a_t) + n_{vl}$; 
$W_r(s_t, a_t) \leftarrow W_r(s_t, a_t) - n_{vl}$; this virtual loss discourages other threads from 
simultaneously exploring the identical variation. At the end of the simulation, the rollout 
statistics are updated in a backward pass through each step $t < L$, replacing the virtual 
losses by the outcome, $N_r(s_t, a_t) \leftarrow N_r(s_t, a_t) - n_{vl} + 1$; 
$W_r(s_t, a_t) \leftarrow W_r(s_t, a_t) + n_{vl} + z_t$. 
Asynchronously, a separate backward pass is initiated when the evaluation 
of the leaf position $s_L$ completes. The output of the value network $v_\theta(s_L)$ is used to 
update value statistics in a second backward pass through each step $t < L$, 
$N_v(s_t, a_t) \leftarrow N_v(s_t, a_t) + 1$, $W_v(s_t, a_t) \leftarrow W_v(s_t, a_t) + v_\theta(s_L)$. The overall evaluation of 
each state action is a weighted average of the Monte Carlo estimates, 

$$Q(s, a) = (1 - \lambda) \frac{W_v(s, a)}{N_v(s, a)} + \lambda \frac{W_r(s, a)}{N_r(s, a)},$$ 

that mixes together the value network and rollout evaluations with weighting 
parameter $\lambda$. All updates are performed lock-free. 

Expansion (Fig. 3b). When the visit count exceeds a threshold, $N_r(s, a) > n_{thr}$, the 
successor state $s' = f(s, a)$ is added to the search tree. The new node is initialized 
to $\{N_v(s', a) = N_r(s', a) = 0,\ W_v(s', a) = W_r(s', a) = 0,\ P(s', a) = p_\tau(a|s')\}$, using a tree 
policy $p_\tau(a|s')$ (similar to the rollout policy but with more features, see Extended 
Data Table 4) to provide placeholder prior probabilities for action selection. The 
position $s'$ is also inserted into a queue for asynchronous GPU evaluation by the 
policy network. Prior probabilities are computed by the SL policy network $p_\sigma^\beta(\cdot|s')$ 
with a softmax temperature set to $\beta$; these replace the placeholder prior 
probabilities, $P(s', a) \leftarrow p_\sigma^\beta(a|s')$, using an atomic update. The threshold $n_{thr}$ is adjusted 
dynamically to ensure that the rate at which positions are added to the policy queue 
matches the rate at which the GPUs evaluate the policy network. Positions are 
evaluated by both the policy network and the value network using a mini-batch 
size of 1 to minimize end-to-end evaluation time. 
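
The per-edge bookkeeping described above (PUCT selection, virtual loss, the two backward passes and the mixed action value) can be sketched in a few lines. This is a single-threaded illustration under stated assumptions, not AlphaGo's implementation: the `Edge` class and function names are invented here, the constants are the values listed in Extended Data Table 5, outcomes are assumed to be passed in already from the perspective of the player who chose the edge, and an unvisited component of $Q$ is treated as 0, which the paper does not specify.

```python
import math
from dataclasses import dataclass

C_PUCT = 5.0   # exploration constant c_puct (Extended Data Table 5)
LAMBDA = 0.5   # mixing parameter lambda
N_VL = 3       # virtual loss n_vl


@dataclass
class Edge:
    prior: float        # P(s, a)
    n_v: int = 0        # N_v(s, a): number of value-network evaluations
    n_r: int = 0        # N_r(s, a): number of rollouts
    w_v: float = 0.0    # W_v(s, a): total value-network evaluation
    w_r: float = 0.0    # W_r(s, a): total rollout reward

    def q(self):
        """Q(s, a) = (1 - lambda) W_v / N_v + lambda W_r / N_r.

        Assumption of this sketch: a component with zero visits counts as 0.
        """
        value = self.w_v / self.n_v if self.n_v else 0.0
        rollout = self.w_r / self.n_r if self.n_r else 0.0
        return (1 - LAMBDA) * value + LAMBDA * rollout


def select(edges):
    """Return argmax_a Q(s, a) + u(s, a) over a dict of action -> Edge."""
    total = sum(e.n_r for e in edges.values())

    def score(action):
        e = edges[action]
        bonus = C_PUCT * e.prior * math.sqrt(total) / (1 + e.n_r)
        return e.q() + bonus

    return max(edges, key=score)


def add_virtual_loss(edge):
    """Pretend the edge has lost n_vl games while a simulation is in flight."""
    edge.n_r += N_VL
    edge.w_r -= N_VL


def backup_rollout(edge, z):
    """Replace the virtual loss with the real rollout outcome z."""
    edge.n_r += 1 - N_VL
    edge.w_r += N_VL + z


def backup_value(edge, v):
    """Second backward pass: add a value-network evaluation v."""
    edge.n_v += 1
    edge.w_v += v
```

With two edges of prior 0.6 and 0.4, applying `add_virtual_loss` to the first makes `select` return the second, which is the intended effect of steering concurrent threads away from a variation that is already being explored.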

We also implemented a distributed APV-MCTS algorithm. This architecture 
consists of a single master machine that executes the main search, many remote 
worker CPUs that execute asynchronous rollouts, and many remote worker GPUs 
that execute asynchronous policy and value network evaluations. The entire search 
tree is stored on the master, which only executes the in-tree phase of each simu- 
lation. The leaf positions are communicated to the worker CPUs, which execute 
the rollout phase of simulation, and to the worker GPUs, which compute network 
features and evaluate the policy and value networks. The prior probabilities of the 
policy network are returned to the master, where they replace placeholder prior 
probabilities at the newly expanded node. The rewards from rollouts and the value 
network outputs are each returned to the master, and backed up the originating 
search path. 

At the end of search AlphaGo selects the action with maximum visit count; this 
is less sensitive to outliers than maximizing action value. The search tree is reused 
at subsequent time steps: the child node corresponding to the played action 
becomes the new root node; the subtree below this child is retained along with all 
its statistics, while the remainder of the tree is discarded. The match version of 
AlphaGo continues searching during the opponent’s move. It extends the search 
if the action maximizing visit count and the action maximizing action value 
disagree. Time controls were otherwise shaped to use most time in the middle-game. 
AlphaGo resigns when its overall evaluation drops below an estimated 10% 
probability of winning the game, that is, $\max_a Q(s, a) < -0.8$. 
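
The final move choice and the resignation rule can be sketched as follows. The function names and the `stats` layout are hypothetical. The conversion between action value and win probability assumes that outcomes are +1 for a win and -1 for a loss, so that a value $Q$ corresponds to a win probability of $(Q + 1)/2$; under that assumption the threshold of -0.8 matches the stated 10%.

```python
RESIGN_THRESHOLD = -0.8  # max_a Q(s, a) below this value triggers resignation


def win_probability(q):
    """Map an action value in [-1, 1] to a win probability, assuming
    outcomes of +1 (win) and -1 (loss)."""
    return (q + 1) / 2


def choose_move(stats):
    """stats: dict mapping action -> (visit_count, q).

    Returns None to signal resignation, otherwise the most visited action.
    """
    best_q = max(q for _, q in stats.values())
    if best_q < RESIGN_THRESHOLD:
        return None
    # Visit count rather than action value: a rarely visited move with a
    # high mean value is more likely to be an outlier.
    return max(stats, key=lambda action: stats[action][0])
```

In this sketch a move with 10 visits and value 0.3 is preferred over a move with 4 visits and value 0.9, reflecting the robustness argument given in the text.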

AlphaGo does not employ the all-moves-as-first or rapid action value estimation 
heuristics used in the majority of Monte Carlo Go programs; when using 
policy networks as prior knowledge, these biased heuristics do not appear to give 
any additional benefit. In addition AlphaGo does not use progressive widening, 
dynamic komi or an opening book. The parameters used by AlphaGo in the 
Fan Hui match are listed in Extended Data Table 5. 

Rollout policy. The rollout policy $p_\pi(a|s)$ is a linear softmax policy based on fast, 
incrementally computed, local pattern-based features consisting of both ‘response’ 
patterns around the previous move that led to state $s$, and ‘non-response’ patterns 
around the candidate move $a$ in state $s$. Each non-response pattern is a binary 
feature matching a specific 3 × 3 pattern centred on $a$, defined by the colour (black, 
white, empty) and liberty count (1, 2, ≥3) for each adjacent intersection. Each 
response pattern is a binary feature matching the colour and liberty count in a 
12-point diamond-shaped pattern centred around the previous move. 
Additionally, a small number of handcrafted local features encode common-sense 
Go rules (see Extended Data Table 4). Similar to the policy network, the weights 
$\pi$ of the rollout policy are trained from 8 million positions from human games on 
the Tygem server to maximize log likelihood by stochastic gradient descent. 
Rollouts execute at approximately 1,000 simulations per second per CPU thread 
on an empty board. 
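
A linear softmax policy over binary features reduces to summing the weights of the features that fire for each candidate move and normalizing. The sketch below shows only that computation; the function name, the feature indices and the weights are placeholders, and the real pattern features are those listed in Extended Data Table 4.

```python
import math


def linear_softmax_policy(active_features, weights):
    """Compute move probabilities for a linear softmax policy.

    active_features: dict mapping move -> list of indices of the binary
        features that are active for that move (hypothetical layout).
    weights: list of feature weights.
    """
    # The logit of a move is the sum of the weights of its active features.
    logits = {
        move: sum(weights[i] for i in indices)
        for move, indices in active_features.items()
    }
    # Subtract the largest logit before exponentiating for numerical stability.
    top = max(logits.values())
    exps = {move: math.exp(logit - top) for move, logit in logits.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}
```

Because only the active features contribute, the logits can be updated incrementally as stones are placed, which is what makes this kind of policy fast enough for rollouts.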

Our rollout policy $p_\pi(a|s)$ contains less handcrafted knowledge than 
state-of-the-art Go programs. Instead, we exploit the higher-quality action selection 
within MCTS, which is informed both by the search tree and the policy network. 
We introduce a new technique that caches all moves from the search tree and 
then plays similar moves during rollouts; a generalization of the ‘last good reply’ 
heuristic. At every step of the tree traversal, the most probable action is inserted 
into a hash table, along with the 3 × 3 pattern context (colour, liberty and stone 
counts) around both the previous move and the current move. At each step of the 
rollout, the pattern context is matched against the hash table; if a match is found 
then the stored move is played with high probability. 
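
The move cache can be pictured as a dictionary keyed by pattern context. The sketch below is a guess at the general shape only: the class name, the representation of a context as a hashable tuple, and the play probability of 0.9 are all assumptions, since the paper says only that the stored move is played "with high probability".

```python
import random


class MoveCache:
    """Hash table from pattern context to the move chosen in the search tree."""

    def __init__(self, play_probability=0.9, seed=0):
        # play_probability is a hypothetical value, not taken from the paper.
        self.table = {}
        self.play_probability = play_probability
        self.rng = random.Random(seed)

    def store(self, context, move):
        """Record the most probable tree action for this pattern context."""
        self.table[context] = move

    def suggest(self, context, legal_moves):
        """Return the cached move for a matching context, or None so the
        caller can fall back to the ordinary rollout policy."""
        move = self.table.get(context)
        if move in legal_moves and self.rng.random() < self.play_probability:
            return move
        return None
```

A rollout step would call `suggest` first and sample from the rollout policy only when it returns `None`.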

Symmetries. In previous work, the symmetries of Go have been exploited by using 
rotationally and reflectionally invariant filters in the convolutional layers. 
Although this may be effective in small neural networks, it actually hurts 
performance in larger networks, as it prevents the intermediate filters from identifying 
specific asymmetric patterns. Instead, we exploit symmetries at run-time by 
dynamically transforming each position $s$ using the dihedral group of eight 
reflections and rotations, $d_1(s), \ldots, d_8(s)$. In an explicit symmetry ensemble, a mini-batch 
of all 8 positions is passed into the policy network or value network and computed 
in parallel. For the value network, the output values are simply averaged, 
$\bar{v}_\theta(s) = \frac{1}{8} \sum_{j=1}^{8} v_\theta(d_j(s))$. For the policy network, the planes of output probabilities 
are rotated/reflected back into the original orientation, and averaged together to 
provide an ensemble prediction, $\bar{p}_\sigma(\cdot|s) = \frac{1}{8} \sum_{j=1}^{8} d_j^{-1}(p_\sigma(\cdot|d_j(s)))$; this approach 
was used in our raw network evaluation (see Extended Data Table 3). Instead, 
APV-MCTS makes use of an implicit symmetry ensemble that randomly selects a 
single rotation/reflection $j \in [1, 8]$ for each evaluation. We compute exactly one 
evaluation for that orientation only; in each simulation we compute the value 
of leaf node $s_L$ by $v_\theta(d_j(s_L))$, and allow the search procedure to average over 
these evaluations. Similarly, we compute the policy network for a single, 
randomly selected rotation/reflection, $d_j^{-1}(p_\sigma(\cdot|d_j(s)))$. 
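
The explicit symmetry ensemble can be sketched with plain nested lists standing in for board planes. Everything named below (`rot90`, `flip`, `SYMMETRIES`, `ensemble_value`, `ensemble_policy`) is invented for illustration, and `value_fn` and `policy_fn` are stand-ins for the networks; the point is only how the eight transforms and their inverses are paired.

```python
def rot90(board):
    """Rotate a square grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def flip(board):
    """Mirror a grid left to right."""
    return [row[::-1] for row in board]


def rot_k(board, k):
    """Apply k clockwise quarter turns (k may be negative)."""
    for _ in range(k % 4):
        board = rot90(board)
    return board


# The dihedral group of the square as (forward, inverse) pairs:
# four rotations, each with and without a mirror.
SYMMETRIES = []
for k in range(4):
    SYMMETRIES.append(
        (lambda b, k=k: rot_k(b, k), lambda b, k=k: rot_k(b, -k))
    )
    SYMMETRIES.append(
        (lambda b, k=k: flip(rot_k(b, k)), lambda b, k=k: rot_k(flip(b), -k))
    )


def ensemble_value(value_fn, board):
    """Average a scalar evaluation over all eight orientations."""
    return sum(value_fn(fwd(board)) for fwd, _ in SYMMETRIES) / 8


def ensemble_policy(policy_fn, board):
    """Average policy planes after mapping each back to the original
    orientation with the inverse transform."""
    n = len(board)
    total = [[0.0] * n for _ in range(n)]
    for fwd, inv in SYMMETRIES:
        plane = inv(policy_fn(fwd(board)))
        for i in range(n):
            for j in range(n):
                total[i][j] += plane[i][j] / 8
    return total
```

The implicit ensemble used inside APV-MCTS would instead pick one `(fwd, inv)` pair at random per evaluation and rely on the search to average over many evaluations.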

Policy network: classification. We trained the policy network $p_\sigma$ to classify 
positions according to expert moves played in the KGS data set. This data set contains 
29.4 million positions from 160,000 games played by KGS 6 to 9 dan human 
players; 35.4% of the games are handicap games. The data set was split into a test set 
(the first million positions) and a training set (the remaining 28.4 million 
positions). Pass moves were excluded from the data set. Each position consisted of a 
raw board description $s$ and the move $a$ selected by the human. We augmented the 
data set to include all eight reflections and rotations of each position. Symmetry 
augmentation and input features were pre-computed for each position. For each 
training step, we sampled a randomly selected mini-batch of $m$ samples from 
the augmented KGS data set, $\{s^k, a^k\}_{k=1}^{m}$, and applied an asynchronous stochastic 
gradient descent update to maximize the log likelihood of the action, 
$\Delta\sigma = \frac{\alpha}{m} \sum_{k=1}^{m} \frac{\partial \log p_\sigma(a^k|s^k)}{\partial \sigma}$. The step size $\alpha$ was initialized to 0.003 and was halved 
every 80 million training steps, without momentum terms, and a mini-batch size 
of $m = 16$. Updates were applied asynchronously on 50 GPUs using DistBelief; 
gradients older than 100 steps were discarded. Training took around 3 weeks for 
340 million training steps. 
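
The step-size schedule can be written as a one-line function, which makes the final step size easy to check. The function name and argument names are hypothetical; the constants are the ones stated in the text.

```python
def step_size(step, initial=0.003, halve_every=80_000_000):
    """Step size after `step` training steps: halved every 80 million steps."""
    return initial * 0.5 ** (step // halve_every)
```

Under this reading of the schedule, the 340 million training steps include four halvings, so the last updates use a step size of 0.003 / 16.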

Policy network: reinforcement learning. We further trained the policy network 
by policy gradient reinforcement learning. Each iteration consisted of a 
mini-batch of $n$ games played in parallel, between the current policy network $p_\rho$ that is 
being trained, and an opponent $p_{\rho^-}$ that uses parameters $\rho^-$ from a previous 
iteration, randomly sampled from a pool of opponents, so as to increase the stability 
of training. Weights were initialized to $\rho = \rho^- = \sigma$. Every 500 iterations, we added 
the current parameters $\rho$ to the opponent pool. Each game $i$ in the mini-batch was 
played out until termination at step $T^i$, and then scored to determine the outcome 
$z_t^i = \pm r(s_{T^i})$ from each player’s perspective. The games were then replayed to 
determine the policy gradient update, 

$$\Delta\rho = \frac{\alpha}{n} \sum_{i=1}^{n} \sum_{t=1}^{T^i} \frac{\partial \log p_\rho(a_t^i|s_t^i)}{\partial \rho} (z_t^i - v(s_t^i)),$$ 

using the REINFORCE algorithm with baseline $v(s_t^i)$ for variance reduction. On 
the first pass through the training pipeline, the baseline was set to zero; on the 
second pass we used the value network $v_\theta(s)$ as a baseline; this provided a small 
performance boost. The policy network was trained in this way for 10,000 
mini-batches of 128 games, using 50 GPUs, for one day. 
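
The REINFORCE update with a baseline can be illustrated on a problem with a single state and a softmax over a handful of actions, where the gradient of the log probability has a closed form (one-hot of the chosen action minus the probabilities). This is a toy sketch under those assumptions, with invented function names, and is not the network training code.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, episodes, alpha, baseline=0.0):
    """One policy gradient step for a one-state softmax policy.

    logits: list of parameters, one per action (stands in for rho).
    episodes: list of (action_index, outcome z) pairs, one per game.
    baseline: constant stand-in for v(s); zero reproduces the first
        training pass described in the text.
    """
    probs = softmax(logits)
    n = len(episodes)
    delta = [0.0] * len(logits)
    for action, z in episodes:
        for j in range(len(logits)):
            # d log p(action) / d logit_j for a softmax policy.
            grad_log = (1.0 if j == action else 0.0) - probs[j]
            delta[j] += alpha / n * grad_log * (z - baseline)
    return [x + d for x, d in zip(logits, delta)]
```

Starting from equal logits, one won game with action 0 and one lost game with action 1 both push probability towards action 0.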

Value network: regression. We trained a value network $v_\theta(s) \approx v^{p_\rho}(s)$ to 
approximate the value function of the RL policy network $p_\rho$. To avoid overfitting to the 
strongly correlated positions within games, we constructed a new data set of 
uncorrelated self-play positions. This data set consisted of over 30 million positions, each 
drawn from a unique game of self-play. Each game was generated in three phases 
by randomly sampling a time step $U \sim \mathrm{unif}\{1, 450\}$, and sampling the first 
$t = 1, \ldots, U - 1$ moves from the SL policy network, $a_t \sim p_\sigma(\cdot|s_t)$; then sampling one move 
uniformly at random from available moves, $a_U \sim \mathrm{unif}\{1, 361\}$ (repeatedly until 
$a_U$ is legal); then sampling the remaining sequence of moves until the game 
terminates, $t = U + 1, \ldots, T$, from the RL policy network, $a_t \sim p_\rho(\cdot|s_t)$. Finally, the game 
is scored to determine the outcome $z_t = \pm r(s_T)$. Only a single training example 
$(s_{U+1}, z_{U+1})$ is added to the data set from each game. This data provides unbiased 
samples of the value function $v^{p_\rho}(s_{U+1}) = \mathbb{E}[z_{U+1} \mid s_{U+1}, a_{U+1, \ldots, T} \sim p_\rho]$. During 
the first two phases of generation we sample from noisier distributions so as 
to increase the diversity of the data set. The training method was identical 
to SL policy network training, except that the parameter update was based on 
mean squared error between the predicted values and the observed rewards, 

$$\Delta\theta = \frac{\alpha}{m} \sum_{k=1}^{m} (z^k - v_\theta(s^k)) \frac{\partial v_\theta(s^k)}{\partial \theta}.$$ 

The value network was trained for 50 million mini-batches of 32 positions, using 
50 GPUs, for one week. 

Features for policy/value network. Each position $s$ was pre-processed into a set 
of 19 × 19 feature planes. The features that we use come directly from the raw 
representation of the game rules, indicating the status of each intersection of the 
Go board: stone colour, liberties (adjacent empty points of a stone’s chain), captures, 
legality, turns since stone was played, and (for the value network only) the current 
colour to play. In addition, we use one simple tactical feature that computes the 
outcome of a ladder search. All features were computed relative to the current 
colour to play; for example, the stone colour at each intersection was represented 
as either player or opponent rather than black or white. Each integer feature value 
is split into multiple 19 × 19 planes of binary values (one-hot encoding). For 
example, separate binary feature planes are used to represent whether an intersection 
has 1 liberty, 2 liberties, ..., ≥8 liberties. The full set of feature planes is listed in 
Extended Data Table 2. 

Neural network architecture. The input to the policy network is a 19 × 19 × 48 
image stack consisting of 48 feature planes. The first hidden layer zero pads the 
input into a 23 × 23 image, then convolves $k$ filters of kernel size 5 × 5 with stride 
1 with the input image and applies a rectifier nonlinearity. Each of the subsequent 
hidden layers 2 to 12 zero pads the respective previous hidden layer into a 21 × 21 
image, then convolves $k$ filters of kernel size 3 × 3 with stride 1, again followed 
by a rectifier nonlinearity. The final layer convolves 1 filter of kernel size 1 × 1 
with stride 1, with a different bias for each position, and applies a softmax 
function. The match version of AlphaGo used $k = 192$ filters; Fig. 2b and Extended 
Data Table 3 additionally show the results of training with $k = 128$, 256 and 
384 filters. 

The input to the value network is also a 19 × 19 × 48 image stack, with an 
additional binary feature plane describing the current colour to play. Hidden layers 2 to 
11 are identical to the policy network, hidden layer 12 is an additional convolution 
layer, hidden layer 13 convolves 1 filter of kernel size 1 × 1 with stride 1, and hidden 
layer 14 is a fully connected linear layer with 256 rectifier units. The output layer 
is a fully connected linear layer with a single tanh unit. 

Evaluation. We evaluated the relative strength of computer Go programs by 
running an internal tournament and measuring the Elo rating of each program. We 
estimate the probability that program $a$ will beat program $b$ by a logistic function 

$$p(a \text{ beats } b) = \frac{1}{1 + \exp(c_{elo}(e(b) - e(a)))},$$ 

and estimate the ratings $e(\cdot)$ by Bayesian logistic regression, computed by the 
BayesElo program using the standard constant $c_{elo} = 1/400$. The scale was anchored 
to the BayesElo rating of professional Go player Fan Hui (2,908 at date of 
submission). All programs received a maximum of 5 s computation time per move; 
games were scored using Chinese rules with a komi of 7.5 points (extra points to 
compensate white for playing second). 
We also played handicap games where AlphaGo played white against existing Go 
programs; for these games we used a non-standard handicap system in which komi 
was retained but black was given additional stones on the usual handicap points. 
Using these rules, a handicap of $K$ stones is equivalent to giving $K - 1$ free moves 
to black, rather than $K - 1/2$ free moves using standard no-komi handicap rules. 
We used these handicap rules because AlphaGo’s value network was trained 
specifically to use a komi of 7.5. 
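
The logistic model that relates two ratings to a win probability is simple enough to evaluate directly. The sketch below follows the formula as stated in this section, with $c_{elo} = 1/400$; the function name is hypothetical, and the fitting of the ratings themselves (done by BayesElo) is not shown.

```python
import math

C_ELO = 1 / 400  # the constant c_elo stated in the text


def win_probability(rating_a, rating_b):
    """p(a beats b) = 1 / (1 + exp(c_elo * (e(b) - e(a))))."""
    return 1 / (1 + math.exp(C_ELO * (rating_b - rating_a)))
```

Two programs with equal ratings each win with probability 0.5, and the two directions of any pairing sum to 1.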

With the exception of distributed AlphaGo, each computer Go program was 
executed on its own single machine, with identical specifications, using the latest 
available version and the best hardware configuration supported by that program 
(see Extended Data Table 6). In Fig. 4, approximate ranks of computer programs 
are based on the highest KGS rank achieved by that program; however, the KGS 
version may differ from the publicly available version. 

The match against Fan Hui was arbitrated by an impartial referee. Five 
formal games and five informal games were played with 7.5 komi, no handi- 
cap, and Chinese rules. AlphaGo won these games 5-0 and 3-2 respectively 
(Fig. 6 and Extended Data Table 1). Time controls for formal games were 1h main 
time plus three periods of 30s byoyomi. Time controls for informal games were 
three periods of 30s byoyomi. Time controls and playing conditions were chosen 
by Fan Hui in advance of the match; it was also agreed that the overall match 
outcome would be determined solely by the formal games. To approximately 
assess the relative rating of Fan Hui to computer Go programs, we appended the 
results of all ten games to our internal tournament results, ignoring differences 
in time controls. 

Extended Data Table 1 | Details of match between AlphaGo and Fan Hui 

Date      Black     White     Category   Result 
5/10/15   Fan Hui   AlphaGo   Formal     AlphaGo wins by 2.5 points 
5/10/15   Fan Hui   AlphaGo   Informal   Fan Hui wins by resignation 
6/10/15   AlphaGo   Fan Hui   Formal     AlphaGo wins by resignation 
6/10/15   AlphaGo   Fan Hui   Informal   AlphaGo wins by resignation 
7/10/15   Fan Hui   AlphaGo   Formal     AlphaGo wins by resignation 
7/10/15   Fan Hui   AlphaGo   Informal   AlphaGo wins by resignation 
8/10/15   AlphaGo   Fan Hui   Formal     AlphaGo wins by resignation 
8/10/15   AlphaGo   Fan Hui   Informal   AlphaGo wins by resignation 
9/10/15   Fan Hui   AlphaGo   Formal     AlphaGo wins by resignation 
9/10/15   AlphaGo   Fan Hui   Informal   Fan Hui wins by resignation 

The match consisted of five formal games with longer time controls, and five informal games with shorter time controls. 
Time controls and playing conditions were chosen by Fan Hui in advance of the match. 

Extended Data Table 2 | Input features for neural networks 

Feature                # of planes   Description 
Stone colour           3             Player stone / opponent stone / empty 
Ones                   1             A constant plane filled with 1 
Turns since            8             How many turns since a move was played 
Liberties              8             Number of liberties (empty adjacent points) 
Capture size           8             How many opponent stones would be captured 
Self-atari size        8             How many of own stones would be captured 
Liberties after move   8             Number of liberties after this move is played 
Ladder capture         1             Whether a move at this point is a successful ladder capture 
Ladder escape          1             Whether a move at this point is a successful ladder escape 
Sensibleness           1             Whether a move is legal and does not fill its own eyes 
Zeros                  1             A constant plane filled with 0 
Player color           1             Whether current player is black 

Feature planes used by the policy network (all but last feature) and value network (all features). 

Extended Data Table 3 | Supervised learning results for the policy network 

Architecture                      Evaluation 
Filters   Symmetries   Features   Test accuracy %   Train accuracy %   Raw net wins %   AlphaGo wins %   Forward time (ms) 
128       1            48         54.6              57.0               36               53               2.8 
192       1            48         55.4              58.0               50               50               4.8 
256       1            48         55.9              59.1               67               55               7.1 
256       2            48         56.5              59.8               67               38               13.9 
256       4            48         56.9              60.2               69               14               27.6 
256       8            48         57.0              60.4               69               5                55.3 
192       1            4          47.6              51.4               25               15               4.8 
192       1            12         54.7              57.1               30               34               4.8 
192       1            20         54.7              57.2               38               40               4.8 
192       8            4          49.2              53.2               24               2                36.8 
192       8            12         55.7              58.3               32               3                36.8 
192       8            20         55.8              58.4               42               3                36.8 

The policy network architecture consists of 128, 192 or 256 filters in convolutional layers; an explicit symmetry ensemble over 2, 4 or 8 symmetries; using only the first 4, 12 or 
20 input feature planes listed in Extended Data Table 2. The results consist of the test and train accuracy on the KGS data set; and the percentage of games won by given policy 
network against AlphaGo’s policy network (highlighted row 2): using the policy networks to select moves directly (raw wins); or using AlphaGo’s search to select moves (AlphaGo 
wins); and finally the computation time for a single evaluation of the policy network. 

Extended Data Table 4 | Input features for rollout and tree policy 

Feature                # of patterns   Description 
Response               1               Whether move matches one or more response pattern features 
Save atari             1               Move saves stone(s) from capture 
Neighbour              8               Move is 8-connected to previous move 
Nakade                 8192            Move matches a nakade pattern at captured stone 
Response pattern       32207           Move matches 12-point diamond pattern near previous move 
Non-response pattern   69338           Move matches 3 × 3 pattern around move 

Self-atari             1               Move allows stones to be captured 
Last move distance     34              Manhattan distance to previous two moves 
Non-response pattern   32207           Move matches 12-point diamond pattern centred around move 

Features used by the rollout policy (first set) and tree policy (first and second set). Patterns are based on stone colour (black/white/empty) and liberties (1, 2, ≥3) 
at each intersection of the pattern. 

Extended Data Table 5 | Parameters used by AlphaGo 

Symbol       Parameter              Value 
$\beta$      Softmax temperature    0.67 
$\lambda$    Mixing parameter       0.5 
$n_{vl}$     Virtual loss           3 
$n_{thr}$    Expansion threshold    40 
$c_{puct}$   Exploration constant   5 

Extended Data Table 6 | Results of a tournament between different Go programs 

Short name         Computer Player       Version             Time settings   CPUs   GPUs   KGS Rank   Elo 
$\alpha^d_{rvp}$   Distributed AlphaGo   See Methods         5 seconds       1202   176    -          3140 
$\alpha_{rvp}$     AlphaGo               See Methods         5 seconds       48     8      -          2890 
CS                 CrazyStone            2015                5 seconds       32     -      6d         1929 
ZN                 Zen                   5                   5 seconds       8      -      6d         1888 
PC                 Pachi                 10.99               400,000 sims    16     -      2d         1298 
FG                 Fuego                 svn1989             100,000 sims    16     -      -          1148 
GG                 GnuGo                 3.8                 level 10        1      -      5k         431 
CS$_4$             CrazyStone            4 handicap stones   5 seconds       32     -      -          2526 
ZN$_4$             Zen                   4 handicap stones   5 seconds       8      -      -          2413 
PC$_4$             Pachi                 4 handicap stones   400,000 sims    16     -      -          1756 

Each program played with a maximum of 5 s thinking time per move; the games against Fan Hui were conducted using longer time controls, as described in Methods. CS$_4$, ZN$_4$ 
and PC$_4$ were given 4 handicap stones; komi was 7.5 in all games. Elo ratings were computed by BayesElo. 

Extended Data Table 7 | Results of a tournament between different variants of AlphaGo 

Short name       Policy network   Value network   Rollouts   Mixing constant   Policy GPUs   Value GPUs   Elo rating 
$\alpha_{rvp}$   $p_\sigma$       $v_\theta$      $p_\pi$    $\lambda = 0.5$   2             6            2890 
$\alpha_{vp}$    $p_\sigma$       $v_\theta$      -          $\lambda = 0$     2             6            2177 
$\alpha_{rp}$    $p_\sigma$       -               $p_\pi$    $\lambda = 1$     8             0            2416 
$\alpha_{rv}$    [$p_\tau$]       $v_\theta$      $p_\pi$    $\lambda = 0.5$   0             8            2077 
$\alpha_{v}$     [$p_\tau$]       $v_\theta$      -          $\lambda = 0$     0             8            1655 
$\alpha_{r}$     [$p_\tau$]       -               $p_\pi$    $\lambda = 1$     0             0            1457 
$\alpha_{p}$     $p_\sigma$       -               -          -                 0             0            1517 

Evaluating positions using rollouts only ($\alpha_{rp}$, $\alpha_r$), value nets only ($\alpha_{vp}$, $\alpha_v$), or mixing both ($\alpha_{rvp}$, $\alpha_{rv}$); either using the policy network $p_\sigma$ ($\alpha_{rvp}$, $\alpha_{vp}$, $\alpha_{rp}$), or no policy 
network ($\alpha_{rv}$, $\alpha_v$, $\alpha_r$), that is, instead using the placeholder probabilities from the tree policy $p_\tau$ throughout. Each program used 5 s per move on a single machine 
with 48 CPUs and 8 GPUs. Elo ratings were computed by BayesElo. 

Extended Data Table 8 | Results of a tournament between AlphaGo and distributed AlphaGo, testing scalability 
with hardware 

AlphaGo        Search threads   CPUs   GPUs   Elo 
Asynchronous   1                48     8      2203 
Asynchronous   2                48     8      2393 
Asynchronous   4                48     8      2564 
Asynchronous   8                48     8      2665 
Asynchronous   16               48     8      2778 
Asynchronous   32               48     8      2867 
Asynchronous   40               48     8      2890 
Asynchronous   40               48     1      2181 
Asynchronous   40               48     2      2738 
Asynchronous   40               48     4      2850 
Distributed    12               428    64     2937 
Distributed    24               764    112    3079 
Distributed    40               1202   176    3140 
Distributed    64               1920   280    3168 

Each program played with a maximum of 2 s thinking time per move. Elo ratings were computed by BayesElo. 

Extended Data Table 9 | Cross-table of win rates in per cent between programs 

                 $\alpha_{rvp}$   $\alpha_{vp}$    $\alpha_{rp}$    $\alpha_{rv}$    $\alpha_{r}$     $\alpha_{v}$     $\alpha_{p}$ 
$\alpha_{rvp}$   -                1 [0; 5]         5 [4; 7]         0 [0; 4]         0 [0; 8]         0 [0; 19]        0 [0; 19] 
$\alpha_{vp}$    99 [95; 100]     -                61 [52; 69]      35 [25; 48]      6 [1; 27]        0 [0; 22]        1 [0; 6] 
$\alpha_{rp}$    95 [93; 96]      39 [31; 48]      -                13 [7; 23]       0 [0; 9]         0 [0; 22]        4 [1; 21] 
$\alpha_{rv}$    100 [96; 100]    65 [52; 75]      87 [77; 93]      -                0 [0; 18]        29 [8; 64]       48 [33; 65] 
$\alpha_{r}$     100 [92; 100]    94 [73; 99]      100 [91; 100]    100 [82; 100]    -                78 [45; 94]      78 [71; 84] 
$\alpha_{v}$     100 [81; 100]    100 [78; 100]    100 [78; 100]    71 [36; 92]      22 [6; 55]       -                30 [16; 48] 
$\alpha_{p}$     100 [81; 100]    99 [94; 100]     96 [79; 99]      52 [35; 67]      22 [16; 29]      70 [52; 84]      - 
CS               100 [97; 100]    74 [66; 81]      98 [94; 99]      80 [70; 87]      5 [3; 7]         36 [16; 61]      8 [5; 14] 
ZN               99 [93; 100]     84 [67; 93]      98 [93; 99]      92 [67; 99]      6 [2; 19]        40 [12; 77]      100 [65; 100] 
PC               100 [98; 100]    99 [95; 100]     100 [98; 100]    98 [89; 100]     78 [73; 81]      87 [68; 95]      55 [47; 62] 
FG               100 [97; 100]    99 [93; 100]     100 [96; 100]    100 [91; 100]    78 [73; 83]      100 [65; 100]    65 [55; 73] 
GG               100 [44; 100]    100 [34; 100]    100 [68; 100]    100 [57; 100]    99 [97; 100]     67 [21; 94]      99 [95; 100] 
CS$_4$           77 [69; 84]      12 [8; 18]       53 [44; 61]      15 [8; 24]       0 [0; 3]         0 [0; 30]        0 [0; 8] 
ZN$_4$           86 [77; 92]      25 [16; 38]      67 [56; 76]      14 [7; 27]       0 [0; 12]        0 [0; 43]        - 
PC$_4$           99 [97; 100]     82 [75; 88]      98 [95; 99]      89 [79; 95]      32 [26; 39]      13 [3; 36]       35 [25; 46] 

95% Agresti-Coull confidence intervals in brackets. Each program played with a maximum of 5 s thinking time per move. CS$_4$, ZN$_4$ and PC$_4$ were given 4 handicap stones; 
komi was 7.5 in all games. Distributed AlphaGo scored 77% [70; 82] against $\alpha_{rvp}$ and 100% against all other programs (no handicap games were played). 

Extended Data Table 10 | Cross-table of win rates in per cent between programs in the single-machine scalability study

Threads       1              2              4              8              16             32             40             40             40             40
GPU           8              8              8              8              8              8              8              4              2              1

1        8    -              70 [61; 78]    90 [84; 94]    94 [83; 98]    86 [72; 94]    98 [91; 100]   98 [92; 99]    100 [76; 100]  96 [91; 98]    38 [25; 52]
2        8    30 [22; 39]    -              72 [61; 81]    81 [71; 88]    86 [76; 93]    92 [83; 97]    93 [86; 96]    83 [69; 91]    84 [75; 90]    26 [17; 38]
4        8    10 [6; 16]     28 [19; 39]    -              62 [53; 70]    71 [61; 80]    82 [71; 89]    84 [74; 90]    81 [69; 89]    78 [63; 88]    18 [10; 28]
8        8    6 [2; 17]      19 [12; 29]    38 [30; 47]    -              61 [51; 71]    65 [51; 76]    73 [62; 82]    74 [59; 85]    64 [55; 73]    12 [3; 34]
16       8    14 [6; 28]     14 [7; 24]     29 [20; 39]    39 [29; 49]    -              52 [41; 63]    61 [50; 71]    52 [41; 64]    41 [32; 51]    5 [1; 25]
32       8    2 [0; 9]       8 [3; 17]      18 [11; 29]    35 [24; 49]    48 [37; 59]    -              52 [42; 63]    44 [32; 57]    26 [17; 36]    0 [0; 30]
40       8    2 [1; 8]       8 [4; 14]      16 [10; 26]    27 [18; 38]    39 [29; 50]    48 [37; 58]    -              43 [30; 56]    41 [26; 58]    4 [1; 18]
40       4    0 [0; 24]      17 [9; 31]     19 [11; 31]    26 [15; 41]    48 [36; 59]    56 [43; 68]    57 [44; 70]    -              29 [18; 41]    2 [0; 11]
40       2    4 [2; 9]       16 [10; 25]    22 [12; 37]    36 [27; 45]    59 [49; 68]    74 [64; 83]    59 [42; 74]    71 [59; 82]    -              3 [1; 17]
40       1    62 [48; 75]    74 [62; 83]    82 [72; 90]    88 [66; 97]    95 [75; 99]    100 [70; 100]  96 [82; 99]    98 [89; 100]   95 [83; 99]    -

95% Agresti-Coull confidence intervals in grey. Each program played with 2 s per move; komi was 7.5 in all games.
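
The bracketed ranges in these cross-tables are Agresti-Coull intervals. A minimal sketch of that calculation follows; the function name and the example of 70 wins in 100 games are hypothetical, since the tables do not report the number of games behind each cell.

```python
import math


def agresti_coull(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Agresti-Coull interval for a binomial proportion.

    z = 1.96 gives an approximately 95% interval. The interval is clipped
    to [0, 1] because the adjusted estimate can overshoot at the extremes.
    """
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need 0 <= wins <= games and games > 0")
    n_adj = games + z * z                    # adjusted number of trials
    p_adj = (wins + z * z / 2.0) / n_adj     # adjusted success proportion
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Hypothetical cell: 70 wins out of 100 games (game count is an assumption).
low, high = agresti_coull(70, 100)
print(f"70% [{100 * low:.0f}; {100 * high:.0f}]")
```

Because the adjustment adds roughly two pseudo-wins and two pseudo-losses, cells with few games show wide intervals even at 0% or 100%, which matches the pattern seen in the tables.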

Extended Data Table 11 | Cross-table of win rates in per cent between programs in the distributed scalability study

Threads             40             12             24             40             64
GPU                 8              64             112            176            280
CPU                 48             428            764            1202           1920

40       8    48    -              52 [43; 61]    68 [59; 76]    77 [70; 82]    81 [65; 91]
12       64   428   48 [39; 57]    -              64 [54; 73]    62 [41; 79]    83 [55; 95]
24       112  764   32 [24; 41]    36 [27; 46]    -              36 [20; 57]    60 [51; 69]
40       176  1202  23 [18; 30]    38 [21; 59]    64 [43; 80]    -              53 [39; 67]
64       280  1920  19 [9; 35]     17 [5; 45]     40 [31; 49]    47 [33; 61]    -

95% Agresti-Coull confidence intervals in grey. Each program played with 2 s per move; komi was 7.5 in all games.

High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects

Benjamin P. Kleinstiver, Vikram Pattanayak, Michelle S. Prew, Shengdar Q. Tsai, Nhu T. Nguyen,
Zongli Zheng & J. Keith Joung


CRISPR-Cas9 nucleases are widely used for genome editing but can induce unwanted off-target mutations. Existing
strategies for reducing genome-wide off-target effects of the widely used Streptococcus pyogenes Cas9 (SpCas9) are
imperfect, possessing only partial or unproven efficacies and other limitations that constrain their use. Here we describe
SpCas9-HF1, a high-fidelity variant harbouring alterations designed to reduce non-specific DNA contacts. SpCas9-HF1
retains on-target activities comparable to wild-type SpCas9 with >85% of single-guide RNAs (sgRNAs) tested in human
cells. Notably, with sgRNAs targeted to standard non-repetitive sequences, SpCas9-HF1 rendered all or nearly all
off-target events undetectable by genome-wide break capture and targeted sequencing methods. Even for atypical, repetitive
target sites, the vast majority of off-target mutations induced by wild-type SpCas9 were not detected with SpCas9-HF1.
With its exceptional precision, SpCas9-HF1 provides an alternative to wild-type SpCas9 for research and therapeutic
applications. More broadly, our results suggest a general strategy for optimizing genome-wide specificities of other
CRISPR-RNA-guided nucleases.


CRISPR-Cas9 nucleases enable highly efficient genome editing in a
wide variety of organisms, but can also cause unwanted mutations
at off-target sites that resemble the on-target sequence. These
off-target effects can confound research experiments and also have potential
implications for therapeutic uses of the technology. Various strategies
have been described to reduce genome-wide off-target mutations of
the commonly used SpCas9 nuclease, including: truncated sgRNAs
bearing shortened regions of target site complementarity, SpCas9
mutants such as the recently described D1135E variant, paired SpCas9
nickases, and dimeric fusions of catalytically inactive SpCas9 to a
non-specific FokI nuclease. However, these approaches are only
partially effective, have as-yet unproven efficacies on a genome-wide
scale, and/or possess the potential to create more new off-target sites.
Furthermore, some require expression of multiple sgRNAs and/or
fusion of additional functional domains to Cas9, which can reduce
targeting range and create challenges for delivery with viral vectors that
have limits on nucleic acid payload size. Thus, a major challenge for the
field remains the development of a robust and easily used strategy that
eliminates off-target mutations on a genome-wide scale.

We initially hypothesized that off-target effects of SpCas9 might
be minimized by decreasing non-specific interactions with its target
DNA site. SpCas9-sgRNA complexes cleave target sites composed of
an NGG protospacer adjacent motif (PAM) sequence (recognized by
SpCas9) and an adjacent 20 base pair (bp) protospacer sequence
(which is complementary to the 5′ end of the sgRNA). We
previously proposed that the SpCas9-sgRNA complex might possess
more energy than is needed for optimal recognition of its intended
target DNA site, thereby enabling cleavage of mismatched off-target
sites. Structural studies have suggested that the
SpCas9-sgRNA-target DNA complex encompasses several SpCas9-mediated DNA
contacts, including direct hydrogen bonds made by four SpCas9
residues (N497, R661, Q695, Q926) to the phosphate backbone of
the target DNA strand (Fig. 1a and Extended Data Fig. 1a, b). We
envisioned that disruption of one or more of these contacts might alter
the energetics of the SpCas9-sgRNA complex so that it might retain
enough energy for robust on-target activity but have a diminished ability to
cleave mismatched off-target sites.


Alteration of SpCas9 DNA contacts 

Guided by this excess energy hypothesis, we first constructed 15 dif- 
ferent SpCas9 variants bearing all possible single, double, triple, and 
quadruple combinations of N497A, R661A, Q695A, and Q926A sub- 
stitutions to test whether contacts made by these residues might be 
dispensable for on-target activity (Fig. 1b). For these experiments, we 
used a previously described human cell-based enhanced GFP (EGFP) 
disruption assay. Using an EGFP-targeted sgRNA, which we have
previously shown can efficiently induce insertion or deletion mutations 
(indels) in an EGFP reporter gene when paired with wild-type SpCas9 
(ref. 4), we found that all 15 SpCas9 variants possessed activities com- 
parable to that of wild-type SpCas9 (Fig. 1b, grey bars). Thus, alanine 
substitution of one or all of these residues did not reduce on-target 
cleavage efficiency of SpCas9 with this EGFP-targeted sgRNA. 
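
The count of 15 variants follows from taking every non-empty combination of the four alanine substitutions (4 single + 6 double + 4 triple + 1 quadruple). A short sketch that enumerates them is given here; the helper name `all_variants` is illustrative only.

```python
from itertools import combinations

# The four alanine substitutions named in the text.
SUBSTITUTIONS = ["N497A", "R661A", "Q695A", "Q926A"]


def all_variants(subs: list[str]) -> list[tuple[str, ...]]:
    """Every non-empty combination of substitutions, singles first."""
    return [
        combo
        for size in range(1, len(subs) + 1)
        for combo in combinations(subs, size)
    ]


variants = all_variants(SUBSTITUTIONS)
print(len(variants))  # 2**4 - 1 non-empty subsets
for variant in variants:
    print("/".join(variant))
```

The last entry printed is the quadruple combination N497A/R661A/Q695A/Q926A.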

Next, we sought to assess the relative activities of all 15 SpCas9
variants at mismatched target sites. To do this, we repeated the EGFP
disruption assay with derivatives of the EGFP-targeted sgRNA used
in the previous experiment that contain pairs of substituted bases at
positions ranging from 13 to 19 (numbering starting with 1 for the
most PAM-proximal base and ending with 20 for the most PAM-distal
base; Fig. 1b). This analysis revealed that one of the triply substituted
variants (R661A/Q695A/Q926A) and the quadruple substitution
variant (N497A/R661A/Q695A/Q926A) both showed minimal EGFP
disruption at or near background levels with all four of the mismatched
sgRNAs (Fig. 1b, coloured bars). Based on these results, we chose the
quadruple substitution variant (hereafter referred to as SpCas9-HF1
for high-fidelity variant number 1) for further analysis.

Figure 1 | Identification and characterization of SpCas9 variants
bearing substitutions in residues that form non-specific DNA contacts.
a, Schematic depicting wild-type SpCas9 interactions with the target
DNA-sgRNA duplex, based on PDB accession 4OO8 and 4UN3 (adapted
from refs 28 and 29, respectively). b, Characterization of SpCas9 variants
that contain alanine substitutions in positions that form hydrogen bonds
with the DNA backbone. Wild-type SpCas9 and variants were assessed
using the human cell EGFP disruption assay when programmed with a
perfectly matched sgRNA or partially mismatched sgRNAs. Error bars
represent s.e.m. for n = 3; mean level of background EGFP loss represented
by red dashed line. c, On-target activities of wild-type SpCas9 and
SpCas9-HF1 across 13 endogenous sites measured by T7 endonuclease I
assay. Error bars represent s.e.m. for n = 3. d, Ratio of on-target activity
of SpCas9-HF1 to wild-type SpCas9. The median and interquartile range
are shown; the interval with >70% of wild-type activity is highlighted in
green.


SpCas9-HF1 retains high on-target activities

To determine how robustly SpCas9-HF1 functions at a larger number of
on-target sites, we performed direct comparisons between this variant
and wild-type SpCas9 using additional sgRNAs. In total, we tested 37
different sgRNAs, 24 targeted to EGFP and 13 targeted to endogenous
human gene targets. For 20 of the 24 sgRNAs tested using the EGFP
disruption assay (Extended Data Fig. 2a) and 12 of the 13 sgRNAs
tested using a T7 endonuclease I mismatch assay (Fig. 1c), we found
SpCas9-HF1 exhibited at least 70% of the on-target activities observed
with wild-type SpCas9 at the same sites (Fig. 1d). Indeed, SpCas9-HF1
showed highly comparable activities (90-140%) to wild-type SpCas9
with the vast majority of sgRNAs (Fig. 1d). Three of the 37 sgRNAs
tested showed essentially no activity with SpCas9-HF1 (EGFP sites 9
and 23, and RUNX1 site 2), and examination of these target sites did not
suggest any obvious differences in the characteristics of these sequences
compared to those for which we saw high activities (Supplementary
Table 1). Overall, SpCas9-HF1 possesses comparable activities (greater
than 70% of wild-type SpCas9 activities) for 86% (32/37) of the sgRNAs
we tested.


Genome-wide specificity of SpCas9-HF1

To test whether SpCas9-HF1 exhibits reduced off-target effects in
human cells, we used the genome-wide unbiased identification of
double-stranded breaks enabled by sequencing (GUIDE-seq) method to
assess eight different sgRNAs targeted to sites in the endogenous human
EMX1, FANCF, RUNX1, and ZSCAN2 genes. The sequences targeted
by these sgRNAs have variable numbers of predicted mismatched sites
in the reference human genome (Extended Data Table 1). Assessment
of on-target double-stranded oligodeoxynucleotide (dsODN) tag
integration (by restriction-fragment length polymorphism (RFLP) assay)
and indel formation (by T7 endonuclease I assay) for the eight sgRNAs
revealed comparable on-target activities with wild-type SpCas9 and
SpCas9-HF1 (Extended Data Fig. 3a and 3b, respectively),
demonstrating that these GUIDE-seq experiments were working efficiently and
comparably with the two different nucleases.

These GUIDE-seq experiments showed that with wild-type SpCas9, 
seven of the eight sgRNAs induced cleavage at multiple off-target sites 
(ranging from 2 to 25 per sgRNA), whereas the eighth sgRNA (FANCF
site 4) did not yield any detectable off-target sites (Fig. 2a, b). The 
off-target sites identified harboured one to six mismatches distributed 
throughout various positions in the protospacer and/or PAM sequence 
(Fig. 2c and Extended Data Fig. 4a). However, with SpCas9-HF1, 
a complete absence of GUIDE-seq detectable off-target events was 
observed for six of the seven sgRNAs that induced off-target effects 
with wild-type SpCas9 (Fig. 2a, b). Among these seven sgRNAs, 
only a single detectable genome-wide off-target was identified, for 
FANCF site 2, at a site harbouring one mismatch within the
protospacer seed sequence (Fig. 2a). As with wild-type SpCas9, the eighth
sgRNA (FANCF site 4) did not yield any detectable off-target cleavage
events when tested with SpCas9-HF1 (Fig. 2a). Notably, with all eight 
sgRNAs, SpCas9-HF1 did not create any new nuclease-induced off- 
target sites (not already observed with wild-type SpCas9) detectable 
by GUIDE-seq. 
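
Off-target sites in this analysis are described by how many positions differ from the on-target protospacer and PAM. A minimal sketch of that mismatch count is shown below. The 23-nucleotide sequences are invented for illustration and are not sites from the study, and treating `N` in the on-target PAM as a wildcard is an assumption of this sketch.

```python
def count_mismatches(on_target: str, candidate: str) -> int:
    """Count positions where candidate differs from on_target.

    Both strings are protospacer plus PAM written 5' to 3'. An 'N' in
    the on-target sequence (for example the first PAM base of NGG) is
    treated as matching any base.
    """
    on_target, candidate = on_target.upper(), candidate.upper()
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(
        1
        for expected, observed in zip(on_target, candidate)
        if expected != "N" and expected != observed
    )


# Hypothetical 20-nt protospacer followed by an NGG PAM.
ON_TARGET = "GATTACAGATTACAGATTACNGG"
CANDIDATE = "GATTACAGTTTACAGATTCCAGG"
print(count_mismatches(ON_TARGET, CANDIDATE))
```

Bulged alignments, which are mentioned later for some VEGFA site 2 off-targets, would need a gapped alignment rather than this position-by-position comparison.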

To confirm these GUIDE-seq findings, we used targeted amplicon
sequencing to more directly measure the frequencies of indel
mutations induced by wild-type SpCas9 and SpCas9-HF1. For these
experiments, we transfected human cells only with sgRNA- and
Cas9-encoding plasmids (without the GUIDE-seq tag). We used
next-generation sequencing to examine the on-target sites and 36 of the 40
off-target sites that had been identified for six sgRNAs with wild-type
SpCas9 in our GUIDE-seq experiments (four of the 40 sites could not
be specifically amplified from genomic DNA). These deep sequencing
experiments showed that: (1) wild-type SpCas9 and SpCas9-HF1
induced comparable frequencies of indels at each of the six sgRNA
on-target sites, indicating that the nucleases and sgRNAs were
functional in all experimental replicates (Fig. 3a, b); (2) as expected,
wild-type SpCas9 showed statistically significant evidence of indel mutations
at 35 of the 36 off-target sites (Fig. 3b) at frequencies that correlated well
with GUIDE-seq read counts for these same sites (Fig. 3c); and (3) the
frequencies of indels induced by SpCas9-HF1 at 34 of the 36 off-target
sites were statistically indistinguishable from the background level of
indels observed in samples from control transfections (Fig. 3b). For the
two off-target sites that appeared to have statistically significant
mutation frequencies with SpCas9-HF1 relative to the negative control, the
mean frequencies of indels were 0.049% and 0.037%, levels at which it is
difficult to determine whether these are due to sequencing or PCR error
or are bona fide nuclease-induced indels. Based on these results, we
conclude that SpCas9-HF1 can completely or nearly completely reduce
off-target mutations that occur across a range of different frequencies
with wild-type SpCas9 to levels generally undetectable by GUIDE-seq
and targeted deep sequencing.

Figure 2 | Genome-wide specificities of wild-type SpCas9 and
SpCas9-HF1 with sgRNAs targeted to standard, non-repetitive sites.
a, Off-target cleavage sites of wild-type SpCas9 and SpCas9-HF1 with
eight sgRNAs targeted to endogenous human genes, as determined by
GUIDE-seq. Read counts represent a measure of cleavage frequency at a
given site; mismatched positions within the spacer or PAM are highlighted
in colour. b, Summary of the total number of genome-wide off-target sites
identified by GUIDE-seq for wild-type SpCas9 and SpCas9-HF1 with
the sgRNAs used in panel a. c, Off-target sites identified for wild-type
SpCas9 and SpCas9-HF1 for the eight sgRNAs, binned according to the
total number of mismatches (in the protospacer and PAM) relative to
the on-target site.


We next assessed the capability of SpCas9-HF1 to reduce
genome-wide off-target effects of sgRNAs designed against atypical
homopolymeric or repetitive sequences. Although we and other researchers
now try to avoid on-target sites with these characteristics due to their
relative lack of orthogonality to the genome, we wished to challenge
the genome-wide specificity of SpCas9-HF1 with sites that have very
large numbers of known off-target sites in human cells. Therefore,
we used previously characterized sgRNAs that target either a
cytosine-rich homopolymeric sequence or a sequence containing multiple
TG repeats in the human VEGFA gene (VEGFA site 2 and VEGFA site
3, respectively) (Extended Data Table 1). In control experiments, we
again found that each of these sgRNAs induced comparable levels of
GUIDE-seq dsODN tag incorporation (Extended Data Fig. 3c) and
indel mutations (Extended Data Fig. 3d) with both wild-type SpCas9
and SpCas9-HF1, demonstrating that SpCas9-HF1 is not impaired
in on-target activity with either of these sgRNAs. Importantly, these
GUIDE-seq experiments revealed that SpCas9-HF1 was highly
effective at reducing off-target sites of these sgRNAs, with 123/144 sites for
VEGFA site 2 and 31/32 sites for VEGFA site 3 not detected (Fig. 4a
and Extended Data Fig. 5). Examination of wild-type SpCas9 off-target
sites not detected with SpCas9-HF1 showed that they each possessed a
range of total mismatches distributed at various positions within their
protospacer and PAM sequences: 2 to 7 mismatches for the VEGFA site
2 sgRNA and 1 to 4 mismatches for the VEGFA site 3 sgRNA (Fig. 4b
and Extended Data Fig. 4b); also, nine of these off-targets for VEGFA
site 2 may be recognized by an alternate potential base pairing
interaction with the sgRNA that might occur with a single bulged base at
the sgRNA-DNA interface (Extended Data Figs 5 and 6). Overall, the
sites that were still mutated by SpCas9-HF1 possessed a range of 2 to
6 mismatches for the VEGFA site 2 sgRNA and 2 mismatches in the
single site for the VEGFA site 3 sgRNA (Fig. 4b), with three of the
off-target sites for the VEGFA site 2 sgRNA having an alternative potential
single bulge alignment (Extended Data Figs 5 and 6). Notably, no new
nuclease-induced off-target sites were induced by SpCas9-HF1 with
either of the two sgRNAs. Collectively, these results demonstrate that
SpCas9-HF1 can be highly effective at reducing off-target effects of
sgRNAs targeted to simple repeat sequences and can also have
substantial impacts on sgRNAs targeted to homopolymeric sequences.


Refining the specificity of SpCas9-HF1

Previously described methods such as truncated sgRNAs and the
SpCas9 D1135E variant can partially reduce SpCas9 off-target effects,
and we therefore wondered whether these might be combined with
SpCas9-HF1 to further improve its genome-wide specificity. Testing of
SpCas9-HF1 with matched full-length and truncated sgRNAs targeted
to four sites in the human cell-based EGFP disruption assay revealed
that shortening sgRNA complementarity length substantially impaired
on-target activities (Extended Data Fig. 7a). By contrast, SpCas9-HF1
with an additional D1135E substitution (a variant we call SpCas9-HF2)
retained 70% or more activity of wild-type SpCas9 with six of eight
sgRNAs tested using our human cell-based EGFP disruption assay
(Fig. 5a and Extended Data Fig. 2b). We also constructed SpCas9-HF3
and SpCas9-HF4 variants harbouring additional L169A or Y450A
substitutions, respectively, at positions whose side chains are believed to
mediate non-specific hydrophobic interactions with the target DNA
on its PAM proximal end (Fig. 1a). The Y450 residue is notable
for participating in a base stacking interaction with the sgRNA and
undergoing a 120 degree shift upon target binding to create its
hydrophobic interaction with the DNA. SpCas9-HF3 and SpCas9-HF4
retained 70% or more of the activities observed with wild-type SpCas9
with the same six out of eight EGFP-targeted sgRNAs (Fig. 5a and
Extended Data Fig. 2b).

Figure 3 | Validation of SpCas9-HF1 specificity improvements by deep
sequencing of off-target sites identified by GUIDE-seq. a, Mean
on-target per cent modification for wild-type SpCas9 and SpCas9-HF1
with six sgRNAs from Fig. 2. Error bars represent s.e.m. for n = 3. b, Per
cent modification of on-target and GUIDE-seq detected off-target sites
with indel mutations. Triplicate experiments are plotted for wild-type
SpCas9, SpCas9-HF1, and a negative control; off-target sites are numbered
as indicated in Fig. 2a. Filled circles below the x axis represent replicates
for which no indels were observed (Supplementary Table 4). Hypothesis
testing using a one-sided Fisher exact test with pooled read counts found
significant differences (P < 0.05 after adjusting for multiple comparisons
using the Benjamini-Hochberg method) for comparisons between
SpCas9-HF1 and the control condition only at EMX1 site 2 off-target 1
and FANCF site 3 off-target 1. Significant differences were also found
between wild-type SpCas9 and SpCas9-HF1 at all off-target sites, and
between wild-type SpCas9 and the control condition at all off-target sites
except RUNX1 site 1 off-target 2. c, Scatter plot of the correlation between
GUIDE-seq read counts (from Fig. 2a) and mean per cent modification
determined by deep sequencing at on- and off-target cleavage sites with
wild-type SpCas9.
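
The Figure 3 legend names a one-sided Fisher exact test on pooled read counts. The sketch below shows one way such a test can be computed from a 2 x 2 table using only the standard library; it is not the authors' analysis code, the function name is illustrative, and the read counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P value for the 2 x 2 table

        [[a, b],     a = indel reads, nuclease sample
         [c, d]]     b = non-indel reads, nuclease sample
                     c = indel reads, control sample
                     d = non-indel reads, control sample

    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least this many indel reads in the nuclease sample.
    """
    row1 = a + b            # total reads in the nuclease sample
    col1 = a + c            # total indel reads across both samples
    total = a + b + c + d
    denominator = comb(total, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return numerator / denominator


# Hypothetical pooled counts: 12 indel reads in 10,000 nuclease reads
# versus 2 indel reads in 10,000 control reads.
p_value = fisher_exact_greater(12, 9988, 2, 9998)
print(f"one-sided P = {p_value:.4f}")
```

P values obtained this way for many sites would then be adjusted for multiple comparisons, as the legend describes, before applying the 0.05 threshold.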

We next sought to determine whether SpCas9-HF2, -HF3, or -HF4
could reduce indel frequencies at two off-target sites that remained
susceptible to modification by SpCas9-HF1, one with the FANCF site
2 sgRNA and another with the VEGFA site 3 sgRNA. For the FANCF
site 2 off-target, which bears a single mismatch in the seed sequence of
the protospacer, we found that SpCas9-HF4 (containing the additional
Y450A substitution) reduced indel mutation frequencies to near
background level as judged by T7 endonuclease I assay, while also beneficially
increasing on-target activity (Fig. 5b), resulting in the greatest increase
in specificity among the three variants (Fig. 5c). For the VEGFA site 3
off-target site, which bears two protospacer mismatches (one in the seed
sequence and one at the nucleotide most distal from the PAM sequence),
SpCas9-HF2 (containing the additional D1135E substitution) showed
near background levels of indel formation as determined by T7
endonuclease I assay while showing modest effects on on-target mutation
efficiency (Fig. 5b), leading to the greatest increase in specificity for this
off-target site among the three variants tested (Fig. 5c).

Figure 4 | Genome-wide specificities of wild-type SpCas9 and
SpCas9-HF1 with sgRNAs targeted to non-standard, repetitive sites.
a, Summary of the total number of genome-wide off-target cleavage sites
identified by GUIDE-seq for wild-type SpCas9 and SpCas9-HF1 with
sgRNAs targeted to VEGFA sites 2 and 3. b, Off-target sites identified for
wild-type SpCas9 or SpCas9-HF1 with sgRNAs targeted to VEGFA sites
2 and 3 binned according to the total number of mismatches (within the
protospacer and PAM) relative to the on-target site.

Figure 5 | Activities of high-fidelity derivatives of SpCas9-HF1
bearing additional substitutions. a, Summary of the on-target
EGFP disruption activities of various SpCas9-HF variants compared to
wild-type SpCas9 (from the data in Extended Data Fig. 2b). SpCas9-HF1
contains N497A, R661A, Q695A, and Q926A substitutions; HF2 = HF1
+ D1135E; HF3 = HF1 + L169A; HF4 = HF1 + Y450A. The median and
interquartile range are shown; the interval showing >70% of wild-type
activity is highlighted in green. b, Mean per cent modification by SpCas9
as well as off-target sites from Fig. 2a and Extended Data Fig. 5 resistant
to the effects of SpCas9-HF1. Per cent modification determined by T7
endonuclease I assay; background indel percentages were subtracted for
all experiments; error bars represent s.e.m. for n = 3. c, Specificity ratios
of wild-type SpCas9 and HF variants with the FANCF site 2 or VEGFA
site 3 sgRNAs, plotted as the ratio of on-target to off-target activity.


Discussion

The SpCas9-HF1 variant characterized in this report reduces all or
nearly all genome-wide off-target effects to undetectable levels as
judged by GUIDE-seq and targeted next-generation sequencing,
with the most robust and consistent effects observed with sgRNAs
designed against standard, non-repetitive target sequences. Our
observations suggest that off-target mutations might be minimized
by using SpCas9-HF1 to target non-repetitive sequences that do not
have closely matched sites (for example, bearing 1 or 2 mismatches)
elsewhere in the genome; such sites can be easily identified using
existing publicly available software programs. An interesting
question will be to determine whether SpCas9-HF1 induces off-target
mutations at frequencies below the detection limit of existing
unbiased genome-wide methods (Supplementary Discussion). We also
discuss other practical considerations for targeting sites of interest
with SpCas9-HF1, including the use of sgRNAs with non-G or
mismatched 5′ nucleotides (Extended Data Fig. 7b) and altering the PAM
recognition specificity of SpCas9-HF1 (Extended Data Fig. 8), in the
Supplementary Discussion.

Further biochemical experiments and structural characterization
will be required to define the mechanism by which SpCas9-HF1
achieves its high genome-wide specificity. We do not believe that the
four substitutions we introduced alter the stability or steady-state
expression level of SpCas9 in human cells, because titration
experiments with decreasing concentrations of expression plasmids
suggest that wild-type SpCas9 and SpCas9-HF1 behave comparably as
their amounts are lowered (Extended Data Fig. 9). Although our
initial rationale for making the substitutions in SpCas9-HF1 was to
decrease the energetics of interaction between the Cas9-sgRNA
complex and the target DNA (as has been previously proposed to explain
the increased specificities of transcription activator-like effector
nucleases bearing substitutions at positively charged residues), recent work
has provided greater mechanistic insights into SpCas9 recognition and
cleavage. These studies suggest alternative and more detailed
models (for example, formation of an active cleavage complex through
conformational changes or kinetics of off-target site recognition)
that might be affected by the substitutions in our SpCas9-HF1 variant
(Supplementary Discussion).

More broadly, our results validate a general strategy for the
engineering of additional high-fidelity variants of CRISPR-associated nucleases.
We found that introducing substitutions at other non-specific DNA
contacting residues can further reduce some of the very small
number of residual off-target sites that persist for certain sgRNAs with
SpCas9-HF1. Thus, we envision that variants such as SpCas9-HF2,
SpCas9-HF4, and others might be used in a customized fashion to
eliminate any potential off-target sites that might be resistant to the
specificity improvements of SpCas9-HF1. In addition, our variants
might be combined with substitutions in residues that contact the
non-target DNA strand, alterations that have been shown to reduce SpCas9
off-target effects while our manuscript was under review. Overall, our
results demonstrate that the approach of mutating non-specific DNA
contacts is highly effective at increasing SpCas9 specificity and suggest
it might be extended to other naturally occurring and engineered Cas9
orthologues, as well as other CRISPR-associated nucleases.
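
Choosing among variants in a customized fashion could be guided by the specificity ratio used in Fig. 5c, that is, on-target activity divided by off-target activity. The sketch below ranks variants by that ratio. All percentages are invented placeholders rather than measurements from this study, and the small floor that avoids division by zero at background-level off-target activity is an assumption of the sketch.

```python
def specificity_ratio(on_target_pct: float, off_target_pct: float,
                      floor_pct: float = 0.1) -> float:
    """Ratio of on-target to off-target per cent modification.

    floor_pct (an assumed value) bounds the denominator so that sites
    with off-target activity at background do not divide by zero.
    """
    if on_target_pct < 0 or off_target_pct < 0:
        raise ValueError("percentages must be non-negative")
    return on_target_pct / max(off_target_pct, floor_pct)


# Placeholder (on-target %, off-target %) pairs for one sgRNA.
hypothetical_activity = {
    "wild-type": (40.0, 20.0),
    "HF1": (38.0, 10.0),
    "HF4": (42.0, 0.5),
}

ranked = sorted(
    hypothetical_activity,
    key=lambda name: specificity_ratio(*hypothetical_activity[name]),
    reverse=True,
)
print(ranked)
```

With real data the ranking may differ from one sgRNA to another, which is consistent with the observation that different HF variants performed best at the FANCF site 2 and VEGFA site 3 off-targets.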

METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 

Plasmids and oligonucleotides. DNA sequences of plasmids used in this study 
can be found in the Supplementary Information. sgRNA target sites are available 
in Supplementary Table 1, and oligonucleotides used in this study can be found 
in Supplementary Table 2. SpCas9 expression plasmids containing amino acid 
substitutions were generated by standard PCR and molecular cloning into JDS246 
(ref. 4). sgRNA expression plasmids were constructed by ligating oligonucleotide 
duplexes into BsmBI cut BPK1520 (ref. 15). Unless otherwise indicated, all sgRNAs 
were designed to target sites containing a 5’ guanine nucleotide. 

Human cell culture and transfection. U2OS cells (a gift from Toni Cathomen,
Freiburg) and U2OS.EGFP cells (containing a single integrated copy of a reporter
gene encoding an EGFP-PEST fusion) were cultured in advanced DMEM
supplemented with 10% heat-inactivated fetal bovine serum, 2 mM GlutaMax, and
penicillin and streptomycin at 37 °C with 5% CO2. The growth media for
U2OS.EGFP cells was additionally supplemented with 400 µg/ml Geneticin. All cell
culture reagents were obtained from Life Technologies. Cell line identity was validated
by STR profiling (ATCC) and deep-sequencing, and cells were tested bi-weekly for
mycoplasma contamination. Unless otherwise noted, cells were co-transfected with
750 ng of Cas9 plasmid and 250 ng of sgRNA plasmid. For negative control
experiments, Cas9 plasmids were co-transfected with a U6-null plasmid. Nucleofections
were performed using the DN-100 program on a Lonza 4-D Nucleofector with the
SE Cell Line Kit according to the manufacturer's protocol (Lonza). For T7
endonuclease I assays, GUIDE-seq experiments, and targeted deep sequencing, genomic
DNA was extracted ~72 h post-transfection using the Agencourt DNAdvance
Genomic DNA Isolation Kit (Beckman Coulter Genomics).

Human cell EGFP disruption assay. EGFP disruption experiments, in which
cleavage and induction of indels by non-homologous end-joining
(NHEJ)-mediated repair within a single integrated EGFP reporter gene leads to loss of cell
fluorescence, were performed as previously described. Briefly, transfected cells
were analysed ~52 h post-transfection for loss of EGFP expression using a Fortessa
flow cytometer (BD Biosciences). Background EGFP loss was determined using
negative control transfections gated at ~2.5% for all experiments (represented
as a red dashed line in figures). P values for comparisons between SpCas9
variants were calculated using a one-sided t-test with equal variances and adjusted for
multiple comparisons using the method of Benjamini and Hochberg
(Supplementary Table 3).
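
The Benjamini and Hochberg adjustment named here can be sketched in a few lines. This is an illustrative implementation of the standard step-up procedure, not the authors' script, and the input P values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, returned in input order.

    Each raw P value is scaled by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downwards enforces that the
    adjusted values are monotone in the raw values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values from four comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A comparison is then called significant when its adjusted value falls below the chosen threshold, 0.05 in the figure legends.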

T7 endonuclease I assays. To quantify mutagenesis frequencies at desired genomic
loci, T7 endonuclease I assays were performed as previously described. Briefly,
on- or off-target sites were amplified from ~100 ng of genomic DNA using Phusion
Hot-Start Flex DNA Polymerase (New England Biolabs) using the primers listed
in Supplementary Table 2. An Agencourt Ampure XP cleanup (Beckman Coulter
Genomics) was performed before the denaturation and annealing of ~200 ng of
the PCR product, followed by digestion with T7 endonuclease I (New England
Biolabs). Purified digestion products were quantified using a QIAxcel capillary
electrophoresis instrument (Qiagen) to approximate the mutagenesis
frequencies induced by Cas9-sgRNA complexes. P values for comparisons between
SpCas9 variants were calculated using a one-sided t-test with equal variances and
adjusted for multiple comparisons using the method of Benjamini and Hochberg
(Supplementary Table 3).

GUIDE-seq. GUIDE-seq relies on the integration of a short dsODN tag into DNA
breaks to enable amplification and sequencing of adjacent genomic sequence,
with the number of tag integrations at any given site providing a quantitative
measure of cleavage efficiency. GUIDE-seq experiments were performed and
analysed essentially as previously described. Briefly, U2OS cells were transfected
with 750 ng of Cas9 and 250 ng of sgRNA plasmids as described above, along with
100 pmol of a GUIDE-seq end-protected dsODN that contains an NdeI restriction
site. Restriction-fragment length polymorphism (RFLP) assays were used
to estimate GUIDE-seq tag integration frequencies at the intended on-target
sites as previously described, using the primers listed in Supplementary Table 2.
The overall on-target mutagenesis frequencies of GUIDE-seq tag-treated samples
were determined by T7 endonuclease I assay as described above. Tag-specific
amplification and library preparation were performed before high-throughput
sequencing on an Illumina MiSeq instrument. GUIDE-seq data was analysed as previously
described using open-source GUIDE-seq analysis software (http://www.jounglab.org/guideseq),
and the summarized results can be found in Supplementary Table 4.
Genomic sites were excluded from analysis on the basis of overlap with back- 
ground genomic breakpoint regions detected in any of four oligo-only control 
samples, overlap with previously identified Cas9-sgRNA independent breakpoints
in human U2OS cells, or as neighbouring genomic window consolidation
artefacts likely due to extensive end-resection around breakpoints (Supplementary
Table 4). Potential RNA- or DNA-bulge sites (Extended Data Fig. 6) were identified
by sequence alignment with Geneious version 8.1.6 (http://www.geneious.com).
Sequencing data was corrected for U2OS cell-type specific SNPs, with the
site encoding the smallest edit distance to the intended sgRNA site used as the most
likely off-target (Supplementary Table 4). Differences in number of GUIDE-seq
identified off-target sites between this work and previous studies are likely due
to different experimental conditions (for example, different promoters, quantity of 
plasmids used for transfection) and/or to sampling effects at the limit of detection 
of these particular experiments (Supplementary Table 4), and most likely not due 
to depth of sequencing which was similar between experiments. 

Positional profiles generated from GUIDE-seq data (Extended Data Fig. 4) were 
made by weighting each nucleotide at each on/off-target site by the number of 
GUIDE-seq read counts. Sites containing gapped alignments relative to the human 
genome were not considered. Positional profiles for potential genomic off-target 
sites were restricted to sequences containing five or fewer mutations relative to the 
on-target site and to sequences containing NGG PAMs. Heat maps were generated 
with R 3.2.2 and the image function, with colours determined using the function
colorRampPalette(c("white", "blue"))(2500).
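The read-count weighting behind the positional profiles can be illustrated with a short Python sketch. It assumes each site is supplied as an ungapped sequence of fixed length paired with its GUIDE-seq read count; the function name and the toy sequences are hypothetical, and the published profiles were generated with the authors' own scripts.

```python
from collections import defaultdict


def weighted_base_frequencies(sites):
    """Per-position base frequencies, weighting each site by its read count.

    `sites` is a list of (sequence, read_count) pairs; all sequences must
    have the same length (gapped alignments are assumed to be excluded).
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0][0])
    totals = [defaultdict(float) for _ in range(length)]
    grand_total = 0.0
    for sequence, reads in sites:
        if len(sequence) != length:
            raise ValueError("all sites must have the same length")
        grand_total += reads
        for position, base in enumerate(sequence.upper()):
            totals[position][base] += reads
    # Convert weighted counts to frequencies at each position.
    return [
        {base: count / grand_total for base, count in column.items()}
        for column in totals
    ]


# Toy input: two made-up 3-nt sites with 30 and 10 reads.
profile = weighted_base_frequencies([("ACG", 30), ("ATG", 10)])
print(profile)
```

A site with three times as many reads contributes three times as much to each column, which is why the second position in the toy example is 75% C and 25% T.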

Targeted deep-sequencing. Off-target sites identified by GUIDE-seq were ampli- 
fied using Phusion High-Fidelity DNA polymerase (New England Biolabs) using 
the primers listed in Supplementary Table 2 for the genomic amplicons listed in 
Supplementary Table 5. PCR products were generated for each on- and off-target 
site from ~100 ng of genomic DNA extracted from U2OS cells. Products were
generated from triplicate transfections for each of three experimental conditions: 
(1) control (wild-type SpCas9 + pSL695, a control plasmid that contains a U6 pro- 
moter but does not encode a functional sgRNA), (2) wild-type SpCas9 + sgRNA, 
and (3) SpCas9-HF1 + sgRNA. PCR products were purified with Ampure XP 
magnetic beads (Agencourt), normalized in concentration, and pooled into nine 
samples (individual triplicate experiments for each of the three conditions listed 
above). Illumina Tru-seq compatible deep-sequencing libraries were prepared 
using ~500 ng of each pooled sample using a ‘with-bead’ HTP library prepara- 
tion kit (KAPA BioSystems), and sequenced via 150-bp paired-end sequencing on 
an Illumina MiSeq instrument. High-throughput sequencing data was analysed 
essentially as previously described. Briefly, paired reads were mapped to the
human genome (reference sequence GRCh37) using the bwa mem algorithm with
default parameters. High-quality reads (average quality score >30) were analysed
for the presence of indels of two or more bp that overlapped the on- or off-target
sites (Supplementary Table 5). One-bp indel mutations were only included if they
occurred directly adjacent to the predicted cleavage site. P values for compari- 
sons between control, wild-type SpCas9 + sgRNA, and SpCas9-HF1 + sgRNA 
(Supplementary Table 5) were obtained on pooled triplicate data using a one-sided 
Fisher exact test in the R 3.2.2 software package. P values for each set of compari- 
sons were adjusted for multiple comparisons using the method of Benjamini and 
Hochberg (function p.adjust(method = "BH") in R).
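The indel-inclusion rule and the one-sided Fisher exact test described above can be sketched in Python as follows. The coordinate convention (0-based, half-open reference intervals, with insertions having equal start and end), the function names and the toy counts are assumptions made for illustration; the study itself used its own indel-calling scripts and R for the statistics.

```python
from math import comb


def indel_is_counted(start, end, size, site_start, site_end, cut_pos):
    """Apply the inclusion rule to one indel.

    The indel occupies reference interval [start, end); `size` is its
    length in bp. The target site spans [site_start, site_end).
    """
    if size >= 2:
        # Indels of two or more bp must overlap (or touch) the target site.
        return start <= site_end and end >= site_start
    # One-bp indels count only when directly adjacent to the cut position.
    return start <= cut_pos <= end


def fisher_one_sided_greater(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] (hypergeometric tail).

    a, b: edited and unedited reads in the treated sample;
    c, d: edited and unedited reads in the control sample.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Toy table (made-up counts): all 3 treated reads edited, 0 of 3 controls.
p_value = fisher_one_sided_greater(3, 0, 0, 3)
print(p_value)  # 0.05
```

The tail sum adds the probabilities of every table at least as extreme as the observed one in the direction of more edited reads in the treated sample, which is what makes the test one-sided.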

Code availability. Scripts for GUIDE-seq analysis (v0.9) can be found at
http://jounglab.org/guideseq. The scripts used for indel calling on deep sequencing data
and GUIDE-seq profiles are available upon request. 


45. Kearse, M. et al. Geneious Basic: an integrated and extendable desktop
software platform for the organization and analysis of sequence data. 
Bioinformatics 28, 1647-1649 (2012). 
pairing between the sgRNA and target DNA. b, Structural representation HNH domain is hidden for visualization purposes.
SpCas9-HF1 (a) and SpCas9-HF1-derivative variants (b) in human cells. represented by the red dashed line.


SpCas9-HF1 contains N497A, R661A, Q695A, and Q926A substitutions; 
Extended Data Figure 3 | On-target activity comparisons of
wild-type and SpCas9-HF1 with various sgRNAs used for GUIDE-seq
experiments. a, c, Mean GUIDE-seq tag integration at the intended
on-target site for GUIDE-seq experiments shown in Fig. 2a and Extended
Data Fig. 5 (a and c, respectively), quantified by restriction-fragment
length polymorphism assay. Error bars represent s.e.m. for n = 3.
b, d, Mean percent modification at the intended on-target site for
GUIDE-seq experiments shown in Fig. 2a and Extended Data Fig. 5
(b and d, respectively), detected by T7 endonuclease I assay. Error bars
represent s.e.m. for n = 3.
Extended Data Figure 4 | Positional summary of off-target sites
identified by GUIDE-seq. a, b, Heat maps derived from GUIDE-seq data
with sgRNAs targeting non-repetitive (a), or repetitive or homopolymeric
sites (b) in the genome are shown. Base frequencies in the set of all
potential genomic off-target sites (weighted equally) with NGG PAMs
and five or fewer mutations for each sgRNA are shown on the left.
Summaries of off-target sites identified by GUIDE-seq for wild-type
SpCas9 and SpCas9-HF1 (both weighted by read count) are shown on
the right. Yellow box outlines denote on-target bases at each position.
Positions (20-1) are shown below the heat maps, with 1 being the most
PAM-proximal position. Note the presence of mismatches that would
be expected to create potential wobble interactions (G→A or T→C) at
certain positions among the off-target sites induced by wild-type SpCas9
and that SpCas9-HF1 appears to reduce off-target activity without any
obvious positional bias.
Extended Data Figure 5 | Genome-wide cleavage specificity of
wild-type SpCas9 and SpCas9-HF1 with sgRNAs targeted to
non-standard, repetitive sites. a, GUIDE-seq profiles of wild-type SpCas9
and SpCas9-HF1 using two sgRNAs known to cleave large numbers of
off-target sites. GUIDE-seq read counts represent a measure of cleavage
efficiency at a given site. Mismatched positions within the spacer or PAM
are highlighted in colour; red circles indicate off-target sites likely to have
the indicated bulge at the sgRNA-DNA interface; blue circles indicate
sites that may have an alternative gapped alignment relative to the one
shown (see Extended Data Fig. 6). Off-target sites marked with red circles
are not included in the counts of Fig. 4b; sites marked with blue circles are
counted with the number of mismatches in the non-gapped alignment for
Fig. 4b.
Extended Data Figure 6 | Potential alternate alignments for VEGFA site 2 off-target sites. Ten VEGFA site 2 off-target sites identified by
GUIDE-seq (left) that may potentially be recognized as off-target sites with single nucleotide gaps (right), aligned using Geneious version 8.1.6
(http://www.geneious.com).
full-length or truncated sgRNAs. b, EGFP disruption activities of
For both panels, error bars represent s.e.m. for n = 3, and the mean level
of background EGFP loss observed in control experiments is represented 
by the red dashed line. 


© 2016 Macmillan Publishers Limited. All rights reserved 




[Extended Data Figure 8 plots: panels comparing SpCas9-VQR, SpCas9-VRQR, SpCas9-VQR-HF1 and SpCas9-VRQR-HF1 at sites with NGAA, NGAC, NGAT and NGAG PAMs (endogenous sites in RUNX1; sites in EGFP; endogenous sites in EMX1, FANCF, RUNX1, VEGFA and ZNF629).]
Extended Data Figure 8 | Altering the PAM recognition specificity
of SpCas9-HF1. a, Comparison of the mean per cent modification of
on-target endogenous human sites by the SpCas9-VQR variant (ref. 15)
and an improved SpCas9-VRQR variant using 8 sgRNAs, quantified by
T7 endonuclease I assay. Both variants are engineered to recognize an
NGAN PAM. Error bars represent s.e.m. for n = 3. b, On-target EGFP
disruption activities of SpCas9-VQR and SpCas9-VRQR compared to
their -HF1 counterparts using eight sgRNAs. Error bars represent s.e.m.
for n = 3; mean level of background EGFP loss in negative controls is
represented by the red dashed line. c, Comparison of the mean on-target
per cent modification by SpCas9-VQR and SpCas9-VRQR compared to
their -HF1 variants at eight endogenous human gene sites, quantified by
T7 endonuclease I assay. Error bars represent s.e.m. for n = 3; ND, not
detectable. d, Summary of the fold-change in on-target activity when using
SpCas9-VQR or SpCas9-VRQR compared to their corresponding -HF1
variants (from b and c). The median and interquartile range are shown;
the interval showing greater than 70% of wild-type activity is highlighted
in green.




[Extended Data Figure 9 plot: EGFP disruption (%) against ng Cas9 plasmid (axis ticks at 10, 100 and 1000) for wild-type SpCas9 and SpCas9-HF1 at site 1 and site 7.]

Extended Data Figure 9 | Titrations of wild-type SpCas9 and 
SpCas9-HF1 expression plasmid amounts. Human cell EGFP disruption 
activities from transfections with varying amounts of wild-type and 
SpCas9-HF1 expression plasmids. For all transfections, the amount of 
sgRNA-containing plasmid was fixed at 250 ng. Two sgRNAs targeting 
different sites were used. Error bars represent s.e.m. for n = 3; mean level
of background EGFP loss in negative controls is represented by the red 
dashed line. 
Extended Data Table 1 | Summary of potential mismatched sites in the reference human genome for the ten
sgRNAs examined by GUIDE-seq

                                         mismatches to on-target site*
site          spacer with PAM            1    2    3     4      5      6  total
EMX1 site 1   GAGTCCGAGCAGAAGAAGAAGGG    0    1   18   273   2318  15831  18441
EMX1 site 2   GTCACCTCCAATGACTAGGGTGG    0    0    3    68    780   6102   6953
FANCF site 1  GGAATCCCTTCTGCAGCACCTGG    0    1   18   288   1475   9611  11393
FANCF site 2  GCTGCAGAAGGGATTCCATGAGG    1    1   29   235   2000  13047  15313
FANCF site 3  GGCGGCTGCACAACCAGTGGAGG    0    0   11    79    874   6651   7615
FANCF site 4  GCTCCAGAGCCGTGCGAATGGGG    0    0    6    59    639   5078   5782
RUNX1 site 1  GCATTTTCAGGAGGAAGCGATGG    0    2    6   189   1644  11546  13387
ZSCAN2        GTGCGGCAAGAGCTTCAGCCGGG    0    3   12   127   1146  10687  11975
VEGFA site 2  GACCCCCTCCACCCCGCCTCCGG    0    2   35   456   3905  17576  21974
VEGFA site 3  GGTGAGTGAGTGTGTGCGTGTGG    1   17  383  6089  13536  35901  55927

*Determined using Cas-OFFinder (http://www.rgenome.net/cas-offinder/).
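The kind of tally summarized in this table can be illustrated with a small Python sketch that bins candidate sites by their number of mismatches to the on-target spacer. This is not Cas-OFFinder; the helper names, the choice to count mismatches over the 20-nt spacer only, the NGG PAM filter and the candidate sequences are all assumptions made for illustration.

```python
from collections import Counter

SPACER_LEN = 20  # assumed spacer length; the final 3 nt are treated as the PAM


def has_ngg_pam(site):
    """True if the 3-nt PAM that follows the spacer matches NGG."""
    return site[SPACER_LEN + 1:SPACER_LEN + 3].upper() == "GG"


def spacer_mismatches(on_target, candidate):
    """Count mismatched spacer positions (ungapped comparison)."""
    if len(on_target) != len(candidate):
        raise ValueError("sites must have equal length")
    a = on_target[:SPACER_LEN].upper()
    b = candidate[:SPACER_LEN].upper()
    return sum(1 for x, y in zip(a, b) if x != y)


def bin_by_mismatches(on_target, candidates, max_mismatches=6):
    """Tally NGG-PAM candidates with 1..max_mismatches spacer mismatches."""
    bins = Counter()
    for candidate in candidates:
        if not has_ngg_pam(candidate):
            continue
        n = spacer_mismatches(on_target, candidate)
        if 1 <= n <= max_mismatches:
            bins[n] += 1
    return bins


# EMX1 site 1 from the table; the candidates below are invented examples.
on_target = "GAGTCCGAGCAGAAGAAGAA" + "GGG"
candidates = [
    "GAGTTCGAGCAGAAGAAGAA" + "GGG",  # one spacer mismatch, NGG PAM
    "AAGTCCGAGCAGAAGAAGAC" + "AGG",  # two spacer mismatches, NGG PAM
    "GAGTTCGAGCAGAAGAAGAA" + "GAT",  # skipped: PAM is not NGG
]
bins = bin_by_mismatches(on_target, candidates)
print(dict(bins))
```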
Dual RNA-seq unveils noncoding RNA
functions in host-pathogen interactions 


Alexander J. Westermann, Konrad U. Förstner, Fabian Amman, Lars Barquist, Yanjie Chao, Leon N. Schulte, Lydia Müller,
Richard Reinhardt, Peter F. Stadler & Jörg Vogel


Bacteria express many small RNAs for which the regulatory roles in pathogenesis have remained poorly understood due to 
a paucity of robust phenotypes in standard virulence assays. Here we use a generic ‘dual RNA-seq’ approach to profile RNA 
expression simultaneously in pathogen and host during Salmonella enterica serovar Typhimurium infection and reveal 
the molecular impact of bacterial riboregulators. We identify a PhoP-activated small RNA, PinT, which upon bacterial 
internalization temporally controls the expression of both invasion-associated effectors and virulence genes required
for intracellular survival. This riboregulatory activity causes pervasive changes in coding and noncoding transcripts 
of the host. Interspecies correlation analysis links PinT to host cell JAK-STAT signalling, and we identify infection- 
specific alterations in multiple long noncoding RNAs. Our study provides a paradigm for a sensitive RNA-based analysis 
of intracellular bacterial pathogens and their hosts without physical separation, as well as a new discovery route for
hidden functions of pathogen genes.


Regulatory RNAs crucially contribute to post-transcriptional control 
of gene expression in a wide array of organisms, including pathogenic 
bacteria. The facultative intracellular pathogen Salmonella enterica
serovar Typhimurium (hereafter referred to as Salmonella) expresses
hundreds of small regulatory RNAs (sRNAs), many of which are activated
under defined stress and virulence conditions, suggesting
a role during host infection. Likewise, genetic inactivation of Hfq, a
protein required by many sRNAs for target mRNA regulation, attenuates
Salmonella virulence. However, deletion of sRNA genes typically
results in mild, if any, phenotypes in animal models, probably
because most sRNAs act to fine-tune gene expression. Therefore,
more sensitive approaches are needed to uncover their molecular 
functions during infection. 

RNA-seq provides a sensitive method for global gene expression analysis
in infection biology. However, as bacterial infections of eukaryotic
cells involve two interacting organisms with profoundly different
transcriptomes, RNA-seq studies are commonly restricted to either the
pathogen or host after their physical separation. Furthermore, they
typically focus on messenger RNAs (mRNAs) as a proxy for protein
expression, neglecting the vast RNA output from noncoding regions.
Theoretically, RNA-seq should permit a simultaneous profiling of all
RNA classes in both intracellular bacteria and the eukaryotic host,
despite a striking excess of eukaryotic over bacterial RNA. Such a
one-step dual RNA-seq analysis separates transcripts in silico, rendering
the tedious and error-prone physical separation of pathogen and
host superfluous. Here, dual RNA-seq has been used to discover how
Salmonella sRNAs fine-tune gene expression in intracellular bacteria,
with widespread consequences for the human host response.
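The idea of separating transcripts in silico can be illustrated with a minimal Python sketch. It assumes that every read has already been aligned against both reference genomes and that the set of organisms it matched is known; the input format, the labels and the handling of ambiguous reads are illustrative assumptions, not the pipeline used in this study.

```python
from collections import Counter


def separate_reads(alignments):
    """Tally reads by organism of origin.

    `alignments` is an iterable of (read_id, organisms) pairs, where
    `organisms` is the set of reference organisms the read aligned to.
    """
    tally = Counter()
    for _read_id, organisms in alignments:
        if not organisms:
            tally["unmapped"] += 1
        elif len(organisms) == 1:
            # Unique assignment to a single organism.
            tally[next(iter(organisms))] += 1
        else:
            # Reads matching both genomes cannot be assigned unambiguously.
            tally["ambiguous"] += 1
    return tally


# Toy input with made-up reads (not data from the study).
toy_alignments = [
    ("r1", {"human"}),
    ("r2", {"human"}),
    ("r3", {"salmonella"}),
    ("r4", {"human", "salmonella"}),
    ("r5", set()),
]
counts = separate_reads(toy_alignments)
print(dict(counts))
```

Setting aside reads that match both genomes is one conservative way to keep cross-mapping from inflating either transcriptome.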


Dual RNA-seq of Salmonella-infected human cells 

We established dual RNA-seq for Salmonella using HeLa cells in which
bacterial invasion is followed by prolonged intracellular replication and
a well-defined host response with low TLR activity, which helps to
reveal host signalling events directly modulated by the pathogen. We
built on previous work combining green fluorescent protein (GFP)-expressing
Salmonella and fluorescence-activated cell sorting (FACS)
to select host cells containing on average 10 or 75 bacteria at 4 or 24 h
post-infection (p.i.), respectively, and also non-invaded control cells
(Fig. 1a and Extended Data Fig. 1a–d). Optimized fixation conditions
preserved both transcriptomes, as well as the GFP signal during the
extended FACS procedure, while also sterilizing the sample (Extended
Data Fig. 1e–h).

To assess coverage of pathogen and host transcripts, we first 
sequenced total RNA without further depletion or enrichment of certain
RNA classes, using Illumina technology. Of ~25 million RNA
reads obtained from infected cells at 4 h, 98.2% were of human origin
and 1.4% of Salmonella origin (Fig. 1b), confirming the predicted
excess of eukaryotic over bacterial RNA. Bacterial reads increased
over time, reflecting intracellular replication of the pathogen (Fig. 1b;
compare 4 and 24 h GFP+). Reads from Salmonella or HeLa cells alone
mapped to their respective genomes with high stringency (Fig. 1b) and 
individual transcript levels were highly reproducible amongst biolog- 
ical triplicates (Extended Data Fig. 1i). 

Collective sequencing of total RNA captured all major bacterial and 
eukaryotic transcript classes (Fig. 1b). Stable housekeeping ribosomal 
RNA (rRNA) transcripts were abundant in both transcriptomes, as 
were reads corresponding to transfer RNA (tRNA) in the Salmonella 
and stable small nucle(ol)ar RNAs (snRNAs, snoRNAs) in the human 
transcriptome. Other noncoding bacterial RNA classes, that is, sRNAs 
and antisense transcripts, contributed ~8% of the Salmonella reads; 
the abundance of human regulatory noncoding transcripts ranged 
from 0.1% for the ~22 nucleotide-long microRNAs (miRNAs) to 15% 
for the 200 to 100,000 nucleotide-spanning long noncoding RNAs 
(lncRNAs).

Messenger RNAs, constituting 16-20% of all reads, predict differ- 
entially expressed proteins in both organisms during infection, which 
Figure 1 | Dual RNA-seq captures the full transcript repertoire of
infected cells. a, Dual RNA-seq workflow. b, Representative mapping
statistics from sorted Salmonella-infected HeLa-S3 cells. LB, Salmonella
grown in medium. 'miRNA' includes primary and mature forms;
'mitoRNA' refers to mitochondrial transcripts. IGR, intergenic region;


was consistent with published pathogen and host microarray data:
expression of invasion-related genes of Salmonella pathogenicity island
1 (SPI-1) decreased after bacterial internalization, whereas expression
of SPI-2 genes promoting intracellular survival increased (Fig. 1c). In
invaded host cells, NF-κB-associated immunity genes were strongly
activated (Fig. 1d). Thus, dual RNA-seq of mixed total RNA reliably
profiled both coding and noncoding RNA patterns in intracellular 
bacteria and their host cells. 


Dynamic sRNA expression in intracellular Salmonella 

To increase coverage of informative transcripts and to make dual 
RNA-seq more sensitive, we successfully depleted both bacterial and 
eukaryotic rRNA (Extended Data Fig. 2). We then profiled sRNAs in 
intracellular Salmonella with high resolution, analysing time-course 
samples taken before and 2, 4, 8, 16 and 24h after infection of HeLa 
cells (Extended Data Fig. 3a). Reads per kilobase transcript per mil- 
lion reads (RPKM) distributions revealed the relative abundance of 
bacterial and human RNA classes (Extended Data Fig. 3b). Altogether, 
we recorded expression of 145 known and 189 candidate Salmonella 
sRNAs, some of which were already induced greater than tenfold 2h 
after invasion (Fig. 2a). Expression changes of well characterized sRNAs
provided insight into Salmonella's microenvironment inside the host, as
exemplified by RyhB and IsrE, which are activated following iron scarcity,
or MicA/L, RybB and OmrA/B, which all report bacterial surface
stress. In addition, decreased SPI-1 and increased SPI-2 expression
after bacterial internalization (see Fig. 1c) is reflected in co-transcriptionally
regulated sRNAs: InvR and DapZ were repressed, whereas
MerR was activated.
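The RPKM measure used above follows directly from its definition: reads per kilobase of transcript per million mapped reads. A minimal Python sketch is given below. The function name and the toy numbers are hypothetical, and whether the denominator counts all mapped reads or only those of one organism is left to the caller, since that choice is not specified here.

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    # 1e9 = 1e3 (nt per kilobase) * 1e6 (reads per million).
    return read_count * 1e9 / (transcript_length_nt * total_mapped_reads)


# Toy numbers (not from the study): 80 reads on an 80-nt transcript
# in a library of one million mapped reads.
value = rpkm(80, 80, 1_000_000)
print(value)  # 1000.0
```

Dividing by transcript length is what lets a short sRNA be compared with a long mRNA on the same scale.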

The most activated sRNA, PinT (renamed from STnc440 (ref. 8) to 
PhoP-induced sRNA in intracellular Salmonella; see later) increased 
up to 100-fold during the infection, as well as in Salmonella grown 
in SPI-2-inducing medium which mimics the intracellular milieu
(Fig. 2a, b and Extended Data Fig. 3c). Likewise, PinT was prominently 
upregulated in dual RNA-seq experiments of 13 other host cell types, 
including murine bone marrow-derived macrophages, differenti- 
ated human THP-1 macrophages, and porcine macrophage-like cells 
(Fig. 2a, insets; Extended Data Fig. 4a, b). 

PinT is an 80 nucleotide-long sRNA from a horizontally acquired
Salmonella-specific locus that also encodes RtsA, a co-activator pro- 
tein of invasion genes (Fig. 2c). However, a correlation analysis of all 
dual RNA-seq time points with global Salmonella regulons predicted 
miscRNA, miscellaneous RNA. c, Compared to extracellular Salmonella
(0 h), intracellular bacteria at 4 h p.i. repress SPI-1 and induce SPI-2
effector genes. d, Invaded (GFP+) host cells at 24 h p.i. activate NF-κB-associated
immunity genes. Data in panels c and d represent fold changes
calculated by edgeR over 3 biological replicates.


control by PhoP/Q (Fig. 2d), the SPI-2-activating two-component 
system that is essential for intracellular survival. Supporting this
prediction, PinT expression was abrogated in a ΔphoP strain and by
mutations in the putative PhoP box in the pinT promoter (Extended 
Data Fig. 3d, e). Thus, PinT is a PhoP-dependent Salmonella sRNA that 
is highly activated upon infection of diverse host cells. 


PinT sRNA times virulence mRNA expression 

PinT exemplifies the current difficulty in understanding sRNA functions
in bacterial virulence: an Hfq-bound sRNA of unknown function,
PinT was selected as a potential Salmonella virulence factor
in genome-wide random mutagenesis screens (TraDIS) in pigs and
cattle, two large-animal models of salmonellosis. Both its sequence
conservation (Extended Data Fig. 3f) and its strong induction inside
host cells further support a role for this sRNA in bacterial virulence.
However, as the pinT gene was not selected by TraDIS in mice and
its deletion produces weak macroscopic phenotypes in cultured cells 
(Extended Data Fig. 3g, h and Extended Data Fig. 4c-e), assays to deci- 
pher its molecular function during infection are currently lacking. 

Combining several approaches, we discovered a molecular func- 
tion of PinT as a timer of virulence gene expression, as summarized 
in Fig. 3a. First, a dual RNA-seq time-course using a ΔpinT strain
to infect HeLa cells enabled the prediction that this sRNA represses 
SPI-2 genes during the early stages after host cell invasion (Fig. 3b and 
Extended Data Fig. 5a). The PinT-mediated repression of SPI-2 was 
independently validated for several transcripts for the secretion appa- 
ratus and effector proteins and by the rescue of wild-type expression 
profiles upon sRNA trans-complementation (strain pinT+) (Extended 
Data Fig. 5b). A dual RNA-seq time-course of Salmonella-infected cells 
from pigs, an organism in which PinT scored as a potential virulence 
factor, also confirmed the PinT effect on SPI-2 (Fig. 3b and Extended
Data Fig. 5a, c). Second, mRNA profiling suggested that PinT acts 
upstream of the SPI-2 master transcription factor, SsrB, without affect- 
ing its own activator PhoP/Q or the invasion gene regulator, HilD 
(Extended Data Fig. 5b). Third, we successfully recapitulated PinT- 
mediated repression of SPI-2 genes under defined in vitro conditions 
(Fig. 3b and Extended Data Fig. 5a). 

We reasoned that as PinT associates with Hfq, it regulates
mRNAs by base pairing. To identify early-infection mRNA targets,
we pulse-expressed PinT under pre-invasion conditions
in vitro (Extended Data Fig. 6a and Supplementary Table 1). Two of 
Figure 2 | A PhoP-dependent Salmonella sRNA induced inside host
cells. a, Dual RNA-seq profiles of regulated Salmonella sRNAs in HeLa-S3
cells (adjusted P value < 0.05). Insets, PinT sRNA expression (red lines)
in different macrophage-like cells. b, Northern blot detection of PinT.
5S and U6 RNAs represent loading controls. ON, overnight. 'Inoculum',
bacteria in host cell medium before infection. Uncropped gel image
in Supplementary Fig. 1. c, pinT locus and sRNA secondary structure.


the top repressed mRNAs encoded the secreted SPI-1 effectors SopE 
and SopE2. Conversely, the sopE/E2 mRNAs were found to be de-repressed
in ΔpinT Salmonella during host cell invasion (Extended
Data Fig. 5b). SopE and SopE2 are guanidyl nucleotide exchange factors
that stimulate innate immune responses and contribute to the
establishment of Salmonella's replicative niche. They were readily
depleted by overexpression of PinT, while other SPI-1 effectors such 
as SopB or SipC were unaffected (Fig. 3c), arguing for selective regu- 
lation. In silico modelling predicted that PinT base pairs near the start 
codon of the sopE and sopE2 mRNAs, which was confirmed by com- 
pensatory mutations in gfp reporter constructs of these targets, and 
within the single-stranded ‘seed’ region of the sRNA (PinT* variant; 
Extended Data Fig. 6b, c). 

To identify relevant PinT targets upon host cell entry, we successfully 
pulse-expressed PinT in bacteria growing inside host cells and 
also analysed PinT overexpression in the SPI-2-inducing medium 
(Extended Data Fig. 6d, e and Supplementary Table 1). The combined 
data sets enabled the prediction that PinT also directly represses the 
mRNAs encoding the proteins GrxA (glutathione/glutaredoxin sys- 
tem) and CRP (cyclic AMP receptor protein), by using the same seed 
sequence as above (Fig. 3c and Extended Data Fig. 6f). 

As CRP and GrxA contribute to virulence gene activation in 
intracellular Salmonella, they might mediate PinT signalling to
SPI-2. To address this, we monitored regulation of the well-known
SsrB-activated SPI-2 gene ssaG after a switch from SPI-1- to SPI-2-inducing
media (Fig. 3d). This in vitro assay recapitulated the premature
activation of SPI-2 previously seen in intracellular ΔpinT bacteria
(Fig. 3b). Reciprocally, overexpression of PinT impaired SPI-2 gene 
activation. Importantly, this regulation was lost upon genomic deletion 
of crp but not of grxA (Fig. 3d), establishing the metabolic regulator 
CRP as a mediator of PinT control of SPI-2. 

Together these data suggest that PinT shapes the transition from 
invasion to the intracellular replication state of Salmonella by simul- 
taneously acting on two SPI-1 effectors and the SPI-2 virulence genes. 
d, PinT as part of the PhoP regulon. Left, 14 analysed global regulons
(Supplementary Table 1). Box plots show the interquartile range (IQR),
with the median marked (red line). Whiskers indicate the highest/lowest
point within 1.5 × IQR of the upper/lower quartile. Right, expression
of PhoP target genes during the course of infection. Sequencing data in
panels a and d are derived from 3 biological replicates.


This model was supported with evidence at the protein level of an 
incomplete clearance of the SPI-1 effector SopE and a faster accumulation
of SPI-2 effector SteC in the ΔpinT bacteria upon shifting from
SPI-1 to SPI-2 conditions, as compared to PinT-expressing Salmonella
(Fig. 3e). Intriguingly, although previous work discovered transcriptional
loops that time SPI-2 expression, our study reveals that PinT
provides a post-transcriptional repressor arm of PhoP in a complex 
feedback loop that helps Salmonella transit from SPI-1 to SPI-2 activity 
upon internalization (Fig. 3a, f). This makes PinT the first sRNA, to 
our knowledge, known to temporally shape the transition between two 
major bacterial virulence programs. 


PinT impacts the host response 

To understand how temporal virulence factor control by PinT may 
impact the host, we interrogated the dual RNA-seq time-course data 
for PinT-dependent changes in the transcriptomes of infected HeLa 
cells. Strikingly, wild-type and ΔpinT Salmonella elicit very different
expression patterns amongst the 14,001 mRNAs, 3,982 lncRNAs
and 134 miRNAs detected for the host (Fig. 4a and Extended Data
Fig. 7a, b). Moreover, while PinT affects bacterial virulence gene expression
early post-invasion of HeLa cells (Fig. 3b), its impact on the host
transcriptome is apparent throughout the course of the infection
(Fig. 4a), even before a small replication phenotype of the ΔpinT strain
has manifested (Extended Data Fig. 3h). 

Interspecies correlation analysis of pathogen and host transcriptome 
changes (see Methods) provided a potential molecular scenario of how 
PinT affects pathogen-host interplay by linking PinT-regulated SPI-2 
gene expression in the bacteria with factors involved in the JAK-STAT 
pathway in the host (Fig. 4b and Supplementary Table 1). Specifically, 
we predicted and validated an accelerated activation of a key regulator 
of JAK-STAT signalling, Suppressor of cytokine signalling 3 (SOCS3), 
in the absence of PinT (Fig. 4a, c and Extended Data Fig. 7c). SOCS3 
inhibits the phosphorylation of the STAT3 (signal transducer and activator
of transcription 3) transcription factor to prevent its activation
Figure 3 | PinT temporally controls Salmonella virulence genes.
a, Model of virulence gene expression regulation by PinT (based on
panels b–e). Green, SPI-1 branch; blue, SPI-2. b, Relative expression of
SPI-2 genes between wild-type Salmonella and the ΔpinT or pinT+
mutant strains. Box plots represent median expression fold changes of
the regulon (whiskers: 1.5 × IQR of the upper/lower quartile). Validated
transcripts (Extended Data Fig. 5b, c) are labelled. c, Western blot
analysis of PinT targets (identified in Extended Data Fig. 6).
Overexpression of wild-type (PinT) or point-mutated (PinT*) sRNA
under SPI-1 (LB, OD600 = 2.0) or SPI-2 conditions (SPI-2 medium,
OD600 = 0.3). Empty vector and GroEL are negative and loading controls,

and nuclear import. Consistent with premature induction of SOCS3
(Fig. 4c), we observed reduced STAT3 phosphorylation in cells infected
with ΔpinT Salmonella compared to wild-type infection (Fig. 4d).
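The interspecies correlation idea can be sketched in Python as a pairwise Pearson correlation between bacterial and host expression profiles measured over the same time points. The actual procedure is described in the Methods, which are not reproduced here, so the choice of Pearson correlation, the function names and the toy profiles below are assumptions made purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant profiles have no defined correlation")
    return cov / sqrt(var_x * var_y)


def correlate_profiles(bacterial, host):
    """Correlate every bacterial profile with every host profile."""
    return {
        (b_name, h_name): pearson(b_values, h_values)
        for b_name, b_values in bacterial.items()
        for h_name, h_values in host.items()
    }


# Hypothetical time-course profiles (arbitrary units, made-up values).
bacterial = {"bacterial_gene": [1.0, 2.0, 3.0, 4.0]}
host = {
    "host_gene_up": [2.0, 4.0, 6.0, 8.0],
    "host_gene_down": [8.0, 6.0, 4.0, 2.0],
}
correlations = correlate_profiles(bacterial, host)
print(correlations)
```

Host genes whose profiles track or mirror a bacterial regulon across the time course would then stand out as candidates for follow-up, as was done experimentally for SOCS3.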

Dual RNA-seq revealed additional impact of PinT activity on 
host immune pathways; for example, increased mRNA abundance 
of the pro-inflammatory chemokine interleukin 8 (IL8, also known 
as CXCL8) in HeLa cells infected with ΔpinT bacteria, which was
further confirmed on both RNA and protein levels (Extended Data
Fig. 7a, d, e). Interestingly, elevated IL8 mRNA levels spread
to bystander cells in ΔpinT-infected cultures (Extended Data
Fig. 7d), probably due to paracrine immune signalling”. Importantly, 
both PinT-controlled SPI-1 effectors SopE and SopE2, as well as sev- 
eral SPI-2 effectors influence JAK-STAT signalling or IL-8 secre- 
tion’>*341_ These host responses are crucial for Salmonella to establish 
its intracellular replication niche’? and compete with the intestinal 
microbiota**“*, but also mediate bacterial killing®. Therefore, our 
observations would favour a model in which Salmonella uses PinT- 
mediated temporal control of virulence genes for optimal manipulation 
of these key host cell pathways to promote its own replication. 

Our dual RNA-seq data provides the first, to our knowledge, global 
map of both polyadenylated and non-polyadenylated host transcripts 
that respond to a bacterial infection (Extended Data Fig. 7a), providing
a temporal view of infection-related lncRNA expression and
processing. Intracellular Salmonella affect the levels of ~44% of all 
respectively. Uncropped gel image in Supplementary Fig. 1. d, PinT
represses SPI-2 through CRP. Salmonella strains with endogenous (black), 
ectopic (grey) or lacking PinT expression (red) in either wild-type or 
deletion background were shifted from SPI-1 to SPI-2 conditions (see 
Methods). A transcriptional reporter fusion of the SPI-2 gene ssaG to gfp 
was used as a proxy for SPI-2 induction. e, Western blot quantification of 
sopE::flag/steC::flag Salmonella shifted from SPI-1 to SPI-2 conditions. 
Protein levels were normalized to GroEL. Points and error bars indicate 
the mean ± s.d. from 3 biological replicates. n.d., not detected. f, Model
of the temporal expression of virulence genes in the presence or absence 
of PinT. 


detected lncRNAs throughout the genome, almost half of which differ
between wild-type and ΔpinT infections (Fig. 4a and Extended
Data Fig. 7a, b, f). Remarkably, certain lncRNAs appear to respond
very quickly to PinT-dependent alterations (Fig. 4a), suggesting that
lncRNAs can provide sensitive markers for pathogen activities in the
early infection phase. 

Dual RNA-seq further permits the analysis of alternative splicing 
events, the formation of circular RNAs, and expression changes in 
small open reading frames; however, PinT generally had little, if any, 
impact on these processes (data not shown). However, in addition to 
pathogen and nuclear host transcripts, our protocol readily captures 
non-polyadenylated organellar transcripts (Fig. 1b), revealing that 
Salmonella strongly induces mitochondrial gene expression in invaded 
host cells (Extended Data Fig. 7a). Moreover, the ΔpinT strain caused
hyperactivation of global mitochondrial RNA expression including 
the mitochondrial oxidative phosphorylation pathway and altered 
the subcellular localization of mitochondria (Fig. 4a, e and Extended 
Data Fig. 8). Although the underlying molecular mechanisms remain 
to be determined, these findings illustrate how a single sRNA affects 
host-pathogen interactions at different levels (Fig. 4f). Importantly, 
the combined data exemplifies how the generic analysis of all tran- 
script classes by dual RNA-seq can reveal changes at the cellular level 
to provide molecular insight into the roles of genes identified in patho- 
genesis screens using animal models. 
Figure 4 | Effect of PinT activity on the host transcriptome.

a, Differentially expressed HeLa-S3 transcripts (adjusted P value < 0.05;
3 biological replicates) between wild-type and ΔpinT infections. Numbers
reflect detected and differentially expressed transcripts, respectively. Full 
gene lists are in Supplementary Table 1. b, Interspecies co-regulation 
analysis (see Methods) correlates Salmonella SPI-2 genes with the human 
JAK-STAT pathway. Enriched bacterial and human GO terms are 
displayed. Complete gene sets in Supplementary Table 1. c, RT-PCR 
measurements of human SOCS3 mRNA in infected (GFP+) and bystander
(GFP−) HeLa-S3 cells after wild-type, ΔpinT or pinT+ infection. Data
represent results from 3 (2h) or 4 (other time points) biological replicates, 


Outlook 

Using dual RNA-seq, we have comprehensively charted the dynamic 
RNA expression landscape of both a bacterial pathogen and its eukar- 
yotic host during the course of infection. This approach enabled us to 
discover PinT as a post-invasion-activated sRNA whose function in 
host-pathogen interactions manifests itself in pervasive expression 
changes in all classes of host RNA. Many other sRNA loci are strongly 
induced upon host cell invasion (Fig. 2a). We selected six whose 
in vivo induction was recapitulated in SPI-2 medium (Extended Data 
Fig. 9a) for pairwise genomic inactivation. Of these, OmrA/B and 
RyhB/IsrE each control multiple mRNAs by Hfq-dependent base pair- 
ing**646, while YrlA/B program RNA decay in Salmonella*’. Whereas 
none of the ΔomrA/ΔomrB, ΔryhB/ΔisrE, and ΔyrlA/ΔyrlB deletion
strains displayed a phenotype in standard invasion/replication
assays (Extended Data Fig. 9b, c), initial dual RNA-seq of HeLa cells 
infected with these double mutants revealed strain-specific changes 
in Salmonella transcripts (Extended Data Fig. 9d, e) and differential 
pathway activities in infected host cells (Extended Data Fig. 9f, g). 
While certain host pathways were generically de-regulated by all three 
mutants (Extended Data Fig. 9g), other pathways were specifically 
impacted by distinct deletion strains, suggesting that Salmonella 
sRNAs actively contribute to successful host infection by affecting 
common as well as disparate host pathways. 

Virulence factor screens have identified many Salmonella pro- 
tein-coding and sRNA genes whose molecular contributions to suc- 
cessful infection have remained unknown due to a failure to link a 
phenotype to its underlying mechanism'?"**8, Our findings with 
asterisk denotes significant expression differences between wild-type-
and ΔpinT-infected cells (P < 0.05; one-tailed Mann-Whitney U-test).

d, Western blot of phosphorylation status of STAT3 in unsorted HeLa-S3 
cells at 16h p.i. with the indicated strains, including bacterial (GroEL) and 
human (tubulin) loading controls. Uncropped gel image in Supplementary 
Fig. 1. e, Representative dual RNA-seq coverage plot for two mitochondrial 
tRNA genes at 16h p.i. of HeLa-S3. f, Summary of how PinT-dependent 
regulation of bacterial effectors that affect SOCS3 (ref. 41) or, via Rho- 
GTPases, STAT3 (ref. 23), may influence host JAK-STAT signalling. STAT3
also affects mitochondria®’, which potentially interconnects the different 
PinT-affected host pathways. SCV, Salmonella-containing vacuole. 


PinT and other intracellularly induced sRNAs illustrate how small 
perturbations in the infection process, such as dysregulation of a few 
Salmonella mRNAs, can propagate through the entire host system, 
potentially leading to different disease outcomes in the context of a 
whole organism. The one-step nature of dual RNA-seq should enable 
high-throughput studies to unravel such hidden gene functions simul- 
taneously in the pathogen and host, during infection with Salmonella 
and many other pathogens. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 

Salmonella strains and mammalian cell lines. Salmonella enterica serovar 
Typhimurium strain SL1344 constitutively expressing GFP from a chromosomal
locus (strain JVS-3858) was previously described*! and is referred to as
wild type throughout this study. The complete list of bacterial strains used in this 
study is provided in Supplementary Table 1. Routinely, bacteria were grown in 
Lennox broth (LB) medium at 37 °C with shaking at 220 r.p.m. When appropriate, 
100 µg ml−1 ampicillin (Amp), 50 µg ml−1 kanamycin (Kan), or 20 µg ml−1 chloramphenicol
(Cm) (final concentrations) were added to the liquid medium or agar
plates. Chromosomal mutagenesis of Salmonella SL1344 was performed as previously
described*. To construct a non-polar pinT mutant strain (YCS-034, GFP−;
or JVS-10038, GFP+), the first ~60 nt of the gene were removed and replaced
by a resistance cassette, while keeping the Rho-independent terminator intact. 
Then, the resistance cassette was eliminated using the FLP helper plasmid pCP20 
at 42°C*. All mutations were transduced into the wild-type background using 
P22 phage*. For plasmid transformation the respective Salmonella strains were 
electroporated with ~10ng of DNA. 

The following cell lines were used in this study: human cervix carcinoma cells 
(HeLa-S3; ATCC CCL-2.2), human epithelial colorectal adenocarcinoma cells 
(CaCo-2; ATCC HTB-37), human epithelial colorectal adenocarcinoma cells 
(HT29; DSMZ No. ACC-299), human stomach adenocarcinoma cells (AGS; 
ATCC CRL-1739), human epithelial colon metastatic cells (LoVo; ATCC CCL- 
229), human embryonic kidney 293 cells (HEK293; ATCC CRL-1573), human 
monocytic cells (THP-1; ATCC TIB-202), murine fibroblast cells (L929; ATCC 
CCL-1), murine embryonic fibroblast cells (MEF; ATCC SCRC-1040), mouse leu- 
kaemic monocyte/macrophage cells (RAW264.7; ATCC TIB-71), porcine intestinal 
epithelial cells (IPEC-J2)*, porcine macrophage-like cells (3D4/31)>. 

HeLa-S3, CaCo-2, THP-1, HEK293, RAW264.7 and MEF cells were obtained
from the group of Thomas Rudel (Biocentre, Würzburg). AGS cells were provided
by Cynthia Sharma (Research Center for Infectious Diseases, Würzburg).
L929 cells were obtained from Thomas Meyer (Max Planck Institute for Infection 
Biology, Berlin). HT29, LoVo, IPEC-J2 and 3D4/31 cells were provided by Karsten 
Tedin (Centre for Infection Medicine, Berlin). Cell lines have not been authenti- 
cated in our laboratory, but were routinely tested for mycoplasma contamination 
(MycoAlert Mycoplasma Detection Kit, Lonza). 

HeLa-S3 cells were cultured according to the guidelines provided by the ENCODE
consortium (http://genome.ucsc.edu/encode/protocols/cell/human/Stam_15_ 
protocols.pdf). Briefly, cells were grown in DMEM (Gibco) supplemented with 
10% fetal calf serum (FCS; Biochrom), 2 mM L-glutamine (Gibco) and 1 mM
sodium pyruvate (Gibco) in T-75 flasks (Corning) in a 5% CO2, humidified atmosphere,
at 37°C. Further cell lines used in this study (THP-1, CaCo-2, AGS, HT29,
LoVo, HEK293, MEF, L929, RAW264.7, IPEC-J2 and 3D4/31) were cultured in 
RPMI (Gibco) supplemented with 10% FCS, 2mM L-glutamine, 1 mM sodium 
pyruvate and 0.5% β-mercaptoethanol (Gibco) in a 5% CO2, humidified atmosphere,
at 37°C. To differentiate THP-1 monocytes, seeded cells (1 x 10° cells per
well; six-well format) were treated with 50 ng ml−1 (final concentration) of phorbol
12-myristate 13-acetate (PMA) (Sigma) for 72h (after 48h fresh PMA at the same 
concentration was added to the culture). 

For the differentiation of murine bone marrow derived macrophages (BMDMs), 
the marrow of femur and tibia was isolated from 8-12-week-old female C57BL/6 
wild-type mice and stored in RPMI supplemented with 10% FCS. The cell suspen- 
sion was centrifuged for 5 min at 250g and the leukocyte pellet was resuspended in 
differentiation medium consisting of X-vivo-15 medium (Lonza) supplemented 
with 10% FCS and 10% L929-conditioned DMEM medium (same composition 
as above). Cells were cultured at 3 x 10° cells per 10 ml in a T-75 flask. At day 3, 
another 3 ml of differentiation medium were added and cells were further cultured 
until day 5. Successful macrophage differentiation was validated by microscopy 
before the cells were detached using a rubber scraper (Sarstedt) and seeded into 
six-well plates at 10° cells per well in fresh differentiation medium. Infection was 
carried out on day 7 as described below. 

Salmonella infection assay. In vitro infection of HeLa-S3 cells was carried out 
following a previously published protocol** with slight modifications. Two days 
before infection 2 x 10° HeLa-S3 cells were seeded in 2 ml complete DMEM (six- 
well format). Overnight cultures of Salmonella were diluted 1:100 in fresh LB 
medium and grown aerobically to an OD600 of 2.0. Bacterial cells were harvested
by centrifugation (2 min at 12,000 r.p.m., room temperature) and resuspended in
DMEM. Infection of HeLa-S3 cells was carried out by adding the bacterial suspen- 
sion directly to each well. If not mentioned otherwise, infections were performed 
at a multiplicity of infection (m.o.i.) of 5. Immediately after addition of bacteria, 
the plates were centrifuged for 10 min at 250g at room temperature followed by 


30 min incubation in 5% CO2, humidified atmosphere, at 37°C. Medium was then
replaced with gentamicin-containing DMEM (final concentration: 50 µg ml−1) to kill
extracellular bacteria. After a further 30 min incubation step, medium was again
replaced by fresh DMEM containing 10 µg ml−1 of gentamicin, and incubated for
the remainder of the experiment. Time point 0 was defined as the time when 
gentamicin was first added to the cells. 

Further cell types were infected as described for HeLa-S3 cells except that
infection was carried out in RPMI medium and that infection was with an m.o.i.
of 10 (THP-1, CaCo-2, HT29, AGS, HEK293, MEF, L929 and RAW264.7) or 
20 (IPEC-J2, 3D4/31), respectively. Infection of BMDMs was carried out with 
an m.o.i. of 10 and using X-vivo-15 medium (10% fetal calf serum, 10% L929- 
conditioned medium). 
Confocal laser scanning microscopy. Infection was carried out as described 
above, except that HeLa-S3 cells had been seeded onto coverslips (24-well format). 
At the respective timepoint, coverslips with infected HeLa-S3 were washed twice 
with PBS (Gibco) and fixed in 4% paraformaldehyde (PFA) for 15 min in a wet 
chamber. After two additional PBS washing steps, cells were stained with Hoechst 
33342 (Invitrogen; diluted 1:5,000 in PBS) for 15 min in a wet chamber and again 
washed twice with PBS. After coverslips had been air-dried, they were embedded in 
Vectashield Mounting Medium (Biozol) and analysed using the Leica SP5 confocal 
microscope (Leica) and the LAS AF Lite software (Leica). 

To stain human mitochondria, MitoTracker Orange CMTMRos (Life
Technologies; kindly provided by V. Kozjak-Pavlovic, Biocentre, Würzburg) was
used. The dye was added in the dark to a final concentration of 200 nM directly 
into the medium of the infected cells in the 37 °C incubator, 30 min before their 
harvest. After the 30 min incubation with the dye, the plates were covered with 
aluminium foil to prevent bleaching during the following steps. The supernatant 
was aspirated and the cells were washed with PBS and fixed with 4% PFA at 4°C 
overnight. Hoechst staining and sample preparation were performed as described
above. 
Flow cytometry and fluorescence-activated cell sorting (FACS). For flow cytom- 
etry-based analyses, infected cultures were washed twice with PBS, detached from 
the bottom of the plate by trypsinization and resuspended in complete DMEM. 
Upon pelleting the cells (5 min at 250g, room temperature), they were resuspended 
in PBS and analysed by flow cytometry using a FACSCalibur instrument (BD 
Biosciences) and the Cyflogic (CyFlo Ltd; version 1.2.1) or Flowing (Cell Imaging 
Core, Turku Centre for Biotechnology, Finland; version 2.5.0) software, respec- 
tively. Selection of intact HeLa-S3 cells was achieved by gating based on cell diame- 
ter (forward-scatter) and granularity (side-scatter) (linear scale). Of those, infected 
(GFP-positive) and non-infected (GFP-negative) sub-fractions were defined based
on GFP signal intensity (FITC channel) versus auto-fluorescence (PE channel) 
(logarithmic scale). 
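
The two-step gating described above can be sketched in a few lines. This is only an illustration, not the gating used in the study: the function name `gate_events`, the rectangular scatter gate and every threshold value are hypothetical, and real gates are drawn by eye on the instrument software.

```python
import numpy as np

def gate_events(fsc, ssc, fitc, pe,
                fsc_range=(200.0, 900.0),   # hypothetical scatter gate (linear scale)
                ssc_range=(100.0, 800.0),   # hypothetical scatter gate (linear scale)
                min_log_ratio=0.5):         # hypothetical GFP cut-off (log10 scale)
    """Return boolean masks (intact, gfp_positive, gfp_negative) per event.

    All intensities must be positive because the GFP gate works on log10 values.
    """
    fsc, ssc, fitc, pe = (np.asarray(a, dtype=float) for a in (fsc, ssc, fitc, pe))
    # Step 1: keep intact cells by forward scatter (size) and side scatter (granularity).
    intact = ((fsc >= fsc_range[0]) & (fsc <= fsc_range[1]) &
              (ssc >= ssc_range[0]) & (ssc <= ssc_range[1]))
    # Step 2: compare GFP signal (FITC) with auto-fluorescence (PE) on a log scale.
    log_ratio = np.log10(fitc) - np.log10(pe)
    gfp_positive = intact & (log_ratio >= min_log_ratio)
    gfp_negative = intact & (log_ratio < min_log_ratio)
    return intact, gfp_positive, gfp_negative
```

A single diagonal cut-off in the FITC versus PE plane is the simplest stand-in for a hand-drawn gate; it separates true GFP signal from cells that are merely bright in both channels.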

For cell sorting, RNAlater-fixed cells (see below) were first passed through
MACS Pre-Separation Filters (30 µm exclusion size; Miltenyi Biotec) and then
analysed and sorted using the FACSAria III device (BD Biosciences) at 4°C (cool- 
ing both the input tube holder and the collection tube rack) and at a medium flow 
rate using the same gating strategy as described above, except that the gates for 
GFP-positive and GFP-negative fractions were conservative in order to prevent 
cross-contamination (as exemplified in Extended Data Fig. 1d). Typically ~2 x 10° 
cells of each fraction were collected for RNA isolation. 
Staining of apoptotic cells and cytotoxicity assay. To detect apoptotic cells, 
HeLa-S3 cells were washed twice with PBS and resuspended in 1 x binding buffer 
(BD Pharmingen) to a concentration of 10° cells per ml. 100 µl of this cell suspension
were mixed with 5 µl of APC-labelled annexin V (BD Pharmingen) and
1 µl of 500 mg ml−1 propidium iodide (PI; lyophilized stock from Sigma). Upon
incubation for 15 min at room temperature, (light-protected) cells were subjected to 
flow cytometry using the MACSQuant Analyzer (Miltenyi Biotec). Upon gating of 
the fraction of intact cells based on cell diameter (forward-scatter) and granularity 
(side-scatter), the annexin-positive/PI-negative sub-population was determined by 
comparison against the appropriate single-stained controls in the APC vs PerCP 
channels, and quantified. 

Necrosis was evaluated by quantifying released lactate dehydrogenase (LDH) 
via the Cytotox96 assay (Promega) according to the manufacturer's instructions. 
The absorbance at 490 nm was measured using a Multiskan Ascent instrument 
(Thermo Fisher). In order to convert the measured absorbance values into the 
relative proportion of dead cells, the maximal absorbance was determined by using 
1x lysis solution (Promega) following the manufacturer’s instructions and referred 
to as 100% cytotoxicity. For both apoptosis and cytotoxicity measurements each 
biological replicate comprised three technical replicates. 
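
As a rough sketch of the conversion described above, the snippet below expresses a sample absorbance as a percentage of the maximal (fully lysed) absorbance. The function name and the optional background term are assumptions added for illustration; the text itself only states that the lysis control defines 100% cytotoxicity.

```python
def percent_cytotoxicity(sample_a490, max_lysis_a490, background_a490=0.0):
    """Relative cytotoxicity in percent from absorbance readings at 490 nm.

    background_a490 is an assumed, optional medium-only blank (default: none).
    """
    span = max_lysis_a490 - background_a490
    if span <= 0:
        raise ValueError("maximal-lysis absorbance must exceed the background")
    return 100.0 * (sample_a490 - background_a490) / span

def mean_of_replicates(values):
    """Average the technical replicates of one biological replicate."""
    return sum(values) / len(values)
```

For example, `percent_cytotoxicity(0.5, 2.0)` evaluates to 25% of the maximal LDH release.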

Quantification of intracellular replication (flow cytometry, c.f.u. assay). To 
quantify bacterial intracellular replication (Extended Data Fig. 1b), infected 
host cells were analysed by flow cytometry as described above, except that the 
increase in GFP intensity (geometric mean) was measured in the GFP-positive 
sub-population over time and normalized to that of the non-infected population
in the same sample (example in Extended Data Fig. 1c). 
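
The normalization just described can be illustrated as follows. This is a minimal sketch with hypothetical function names; it assumes per-event GFP intensities have already been exported for the GFP-positive and non-infected gates of the same sample.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive fluorescence intensities."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalized_gfp(gfp_positive_events, non_infected_events):
    """GFP geometric mean of infected cells relative to non-infected cells."""
    return geometric_mean(gfp_positive_events) / geometric_mean(non_infected_events)
```

Tracking this ratio across time points gives a replication readout that is corrected for sample-to-sample differences in background fluorescence.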

Alternatively, infected HeLa-S3 cultures were solubilized with PBS containing 
0.1% Triton X-100 (Gibco) at the respective time points. Cell lysates were serially 
diluted in PBS, plated onto LB plates and incubated at 37°C overnight. The number 
of colony forming units (c.f.u.) recovered was compared to that obtained from the 
bacterial input solution used for infection. In all cases, each biological replicate 
comprised three technical replicates. 
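
The c.f.u. arithmetic behind this assay can be sketched as below. The function names, the plated volume and the example dilution are hypothetical; the text does not state which dilutions or volumes were plated.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate c.f.u. per ml of undiluted sample.

    dilution_factor is the fold dilution of the plated sample (1000 for a 10^-3 dilution).
    """
    return colonies * dilution_factor / plated_volume_ml

def recovered_fraction(cfu_recovered, cfu_input):
    """Recovered c.f.u. relative to the bacterial input used for infection."""
    return cfu_recovered / cfu_input

def mean_of_replicates(values):
    """Average the technical replicates of one biological replicate."""
    return sum(values) / len(values)
```

With 50 colonies on a plate that received 0.1 ml of a 1,000-fold dilution, `cfu_per_ml(50, 1000, 0.1)` gives about 5 x 10^5 c.f.u. per ml.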

Evaluation of different fixation techniques. Infected cells were washed twice with 
PBS, trypsinized and pelleted. For ethanol fixations, cell pellets were re-dissolved 
in 0.1 volume of ice-cold PBS and then 0.9 volume of ice-cold ethanol (either 70% 
or 100%; as indicated) were added in single droplets during shaking (400 r.p.m., 
4°C) to avoid cell clumping. Fixation using stop solution (95% EtOH/5% water-sat- 
urated phenol)*” was performed by resuspending the cell pellet in PBS before the 
addition of 0.2 volume of stop solution and mixing. When PFA was used, the 
pellet was resuspended in the respective PFA concentration (0.5% or 4% PFA, 
pH 7.4, with or without 4% sucrose) and shaken for 15 min at 400 r.p.m., room
temperature. PFA-induced crosslinks were reverted by an additional heating step 
for 15min at 70°C (refs 58, 59). For fixation with RNAlater (Qiagen), cell pellets 
were directly resuspended in RNAlater (1 ml per 5 x 10° cells). For systematic 
evaluation of different fixation protocols (Extended Data Fig. 1e-g), fixed cells
had not been sorted but were either directly analysed upon fixation (30 min) or
stored at −20°C (ethanol-based fixatives) or 4°C (others), respectively, overnight.
To prepare RNAlater-fixed samples for sorting, tubes containing ~5 x 10° fixed 
cells were filled up with 10 ml of ice-cold PBS, centrifuged (5 min, 500g, 4°C) and 
cell pellets resuspended in 2 ml of cold PBS. This cell suspension was filtered and 
sorted (as described above). 

RNA extraction, DNase treatment, evaluation of RNA quality and qRT-PCR. In 
the dual RNA-seq experiments, as a reference for gene expression changes in host 
cells upon infection, a non-infected yet mock-treated control was included. The 
bacterial reference samples were derived from Salmonella grown in LB to an OD600
of 2.0, which either were then shifted to DMEM for 15 min, pelleted and fixed in 
RNAlater (see above) or were fixed directly (that is, without a medium exchange 
step) as indicated. Fixed Salmonella cells were pelleted and lysed using the lysis/ 
binding buffer of the mirVana kit (Ambion). In order to maintain the approximate 
ratio of bacterial to host transcripts during RNA isolation, Salmonella lysates were 
mixed with host cell lysate in a way that the calculated proportion of individual 
Salmonella cells per infected host cell at the latest time point (see Extended Data 
Fig. 1h) was matched. The resulting mixture was then processed collectively. RNA 
was extracted from cells using the mirVana kit (Ambion) following the manufac- 
turer’s instructions for total RNA isolation. To remove contaminating genomic 
DNA, samples were treated with 0.25 U of DNase I (Fermentas) per 1 µg of RNA
for 45 min at 37°C. If applicable, RNA quality was checked on the Agilent 2100 
Bioanalyzer (Agilent Technologies). 

For qRT-PCR experiments total RNA was isolated using the TRIzol LS reagent 
(Invitrogen) according to the manufacturer’s recommendations and treated with 
DNase I (Fermentas) as described above. qRT-PCR was performed with the Power 
SYBR Green RNA-to-CT 1-Step kit (Applied Biosystems) according to the manufacturer’s
instructions. Fold changes were determined using the 2^(−ΔΔCt) method™.
Primer sequences are given in Supplementary Table 1 and their specificity had 
been confirmed using Primer-BLAST (NCBI). For the estimation of Salmonella 
RNA within infection samples (Extended Data Fig. 1h), a dilution series of sepa- 
rately isolated Salmonella and HeLa-S3 total RNA was set up and in each case the 
ratio of rfaH/ACTB mRNAs was determined. The same was done for biological 
samples from infected cells as well as for the Salmonella reference controls. From 
the resulting trend-line equation the approximate proportion of the Salmonella 
transcriptome within mixed prokaryotic and eukaryotic total RNA samples could 
be deduced. 
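
The fold-change calculation used for qRT-PCR in this section is commonly written as 2^(−ΔΔCt). A minimal sketch follows; the function and argument names are hypothetical, and the formula carries the method's usual assumption that target and reference amplify with roughly 100% efficiency.

```python
def fold_change_ddct(ct_target_sample, ct_reference_sample,
                     ct_target_control, ct_reference_control):
    """Relative expression of a target gene by the 2^(-ddCt) approach.

    Each Ct is first normalized to a reference gene measured in the same RNA,
    then the sample is compared with the control condition.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

For instance, a target that appears three cycles earlier in the sample than in the control, at equal reference Ct, corresponds to an eightfold induction.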
rRNA depletion (Ribo-Zero treatment). Where indicated (Supplementary Table 1), 
Salmonella and eukaryotic host rRNA were removed using the Ribo-Zero Magnetic 
Gold Kit (Epidemiology) purchased from Epicentre/Illumina. Following the man- 
ufacturer’s instructions, ~500 ng of total, DNase-I-treated RNA from infection 
samples was used as an input to the ribosomal transcript removal procedure. 
rRNA-depleted RNA was precipitated in ethanol for 3h at −20°C.
cDNA library generation and (dual) RNA-seq. cDNA libraries for Illumina 
sequencing were generated by Vertis Biotechnologie AG, Freising-Weihenstephan, 
Germany. For dual RNA-seq of total RNA, at least 100 ng RNA were used for 
cDNA library preparation. DNase-I-treated total RNA samples were first 
sheared via ultra-sound sonication (4 pulses of 30s at 4°C each) to generate 
~200-400 bp (average) fragmentation products. Fragments <20 nt were removed 
using the Agencourt RNAClean XP kit (Beckman Coulter Genomics). As an 
internal quality control for the pilot experiment (shown in Fig. 1), spike-in 
RNA (5’-AAAUCCGUUCGUACGGGCCC-3’; 5’-monophosphorylated and
gel-purified) was added to a final concentration of 0.5%. The samples were poly(A)-tailed
using poly(A) polymerase and the 5’ triphosphate (or eukaryotic 5’ cap)
structures were removed using tobacco acid pyrophosphatase (TAP). Afterwards, 
an RNA adaptor was ligated to the 5’ monophosphate of the RNA fragments. First- 
strand cDNA synthesis was performed using an oligo(dT)-adaptor primer and the 
M-MLV reverse transcriptase (NEB). The resulting cDNA was PCR-amplified to 
about 20-30 ng µl−1 using a high fidelity DNA polymerase (barcode sequences for
multiplexing were part of the 3’ primers). The cDNA library was purified using 
the Agencourt AMPure XP kit (Beckman Coulter Genomics) and analysed by 
capillary electrophoresis (Shimadzu MultiNA microchip electrophoresis system). 

cDNA libraries for dual RNA-seq on rRNA-depleted samples were con- 
structed as described above, except for the following modifications. Upon RNA 
fragmentation, dephosphorylation with Antarctic Phosphatase (AP, NEB) and 
re-phosphorylation with T4 Polynucleotide Kinase (PNK, NEB) were performed. 
Oligonucleotide adapters were ligated to both the 5’ and 3’ ends of the RNA samples.
First-strand cDNA synthesis was performed using M-MLV reverse transcriptase
and the 3’ adaptor as primer.

cDNA libraries from Salmonella-only samples were generated by fragmenting 
5 µg of total RNA using ultrasound and RNAs <20 nt were removed using the
Agencourt RNAClean XP kit (Beckman Coulter Genomics) as above. The RNA 
samples were poly(A)-tailed and 5’ppp structures were removed as before. 
RNA adapters were ligated to the 5’ monophosphate of the RNA and first-strand 
cDNA synthesis was performed using an oligo(dT)-adaptor primer and the 
M-MLV reverse transcriptase. The resulting cDNAs were PCR-amplified, purified
using the Agencourt AMPure XP kit (Beckman Coulter Genomics) and analysed 
by capillary electrophoresis (Shimadzu MultiNA microchip). 

Generally, for sequencing cDNA samples were pooled in approximately equimo- 
lar amounts. The cDNA pool was size-fractionated in the size range of 150-600 bp 
using a differential clean-up with the Agencourt AMPure kit. For the dual RNA- 
seq pilot experiment (Fig. 1), single-end sequencing (100 cycles) was performed 
on an Illumina HiSeq 2000 machine at the Max Planck Genome Centre Cologne,
Cologne, Germany. For dual RNA-seq on rRNA-free samples as well as for con- 
ventional RNA-seq of Salmonella-only samples, single-end sequencing (75 cycles) 
was performed on a NextSeq500 platform at Vertis Biotechnologie AG, Freising- 
Weihenstephan, Germany. 
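
Pooling libraries in approximately equimolar amounts requires converting mass concentration into molarity. The sketch below shows one common way to do this and is not taken from the protocol above: it assumes double-stranded cDNA with an average mass of about 660 g per mol per base pair, and the library names and target amount are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_length_bp, g_per_mol_per_bp=660.0):
    """Approximate molarity (nM) of a double-stranded library."""
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_length_bp)

def pooling_volumes_ul(libraries, fmol_per_library):
    """Volume (µl) of each library that contributes the same molar amount.

    libraries maps a name to (concentration in ng/µl, mean fragment length in bp).
    Because 1 nM equals 1 fmol per µl, volume = fmol / nM.
    """
    return {name: fmol_per_library / library_molarity_nM(conc, length)
            for name, (conc, length) in libraries.items()}
```

A library at twice the concentration of another, with the same mean length, therefore contributes half the volume to the pool.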

All RNA-seq data discussed in this publication have been deposited in NCBI's 
Gene Expression Omnibus and are accessible through GEO Series accession 
number GSE60144. For the accession numbers of individual experiments, see 
Supplementary Table 1. 

Northern blotting (of bacterial and human transcripts). Total RNA prepared 
with TRIzol LS reagent (Invitrogen) was separated in 6% (vol/vol) polyacrylamide-8.3 M
urea gels and blotted as described". We loaded per lane either 5-10 µg
of RNA from pure bacterial samples (Extended Data Figs 3d and 9a), 2 µg total
RNA from sorted cell samples (Extended Data Fig. 8b), or 50 µg total RNA from
unsorted infection samples (Fig. 2b). Hybond XL membranes (Amersham) were 
hybridized at 42°C with gene-specific [32P] end-labelled DNA oligonucleotides
(see Supplementary Table 1 for sequences) in Hybri-Quick buffer (Carl Roth AG). 
Mutational analysis of the pinT promoter region. The pinT promoter region was 
amplified by PCR using primers JVO-7036/-7037 and inserted via the AatII and 
NheI sites in the backbone of plasmid pAS093, resulting in plasmid pYC65. To
identify the PhoP binding sites in a minimal fragment, the pinT promoter region 
was truncated by amplifying pYC65 using Phusion polymerase (NEB) with JVO- 
9393/-7387. The critical residues in the PhoP binding motif (T−27T−28) were
mutated to adenines by site-directed mutagenesis with JVO-12461/-12462 and 
Phusion polymerase (NEB). 

sRNA pulse-expression (in vitro and in vivo). For pulse-expression of PinT in 
in vitro grown Salmonella, we used arabinose-induced overexpression of PinT from 
a pBAD plasmid previously described'®°!*! with minor modifications. Briefly, 
wild-type Salmonella that carried either a pKP8-35 (pBAD control), pYC5-34
(pBAD-PinT) or pYC60 (pBAD-PinT*) plasmid were grown overnight in LB and,
the next day, the cultures were 1:100 diluted and further grown in LB to an OD600
of 2.0. L-arabinose (Sigma) was added to a final concentration of 0.2%; 5 min later 
RNA was extracted using TRIzol LS reagent (Invitrogen) and analysed by RNA- 
seq (~3-5 million reads/library). For the same experiment under SPI-2-inducing 
conditions, overnight cultures of the three strains were washed 2x with PBS and 
1x with SPI-2 medium’, diluted 1:50 in SPI-2 medium and grown to an OD600
of 0.3 before PinT expression was induced as above. 

For the pulse-expression of PinT inside host cells (Extended Data Fig. 6d, e), 
HeLa-S3 cells were infected with the same three strains as above and 4h after 
infection, 0.2% L-arabinose was supplemented directly into the DMEM medium. 
Activation of inducible sRNA expression in intracellular bacteria was confirmed 
by qRT-PCR over a time-course of 20 min (Extended Data Fig. 6d), demonstrating 
full induction levels to be reached already at 5 min. Thus, for Extended Data Fig. 6e 
the host cells were lysed at 5 min after induction with ice-cold 0.1% Triton X-100/
PBS and further incubated for 30 min on ice with pipetting up and down from 
time to time to improve host cell lysis efficiency. Then the intact bacterial cells 
were pelleted by centrifugation for 2 min at 16,100g (4°C) and resuspended in 
RNAlater (Qiagen). The fixed bacterial cells were further enriched against the host 
background via cell sorting (FACSAria III, BD Biosciences) and selective gating 
for the fraction of GFP* bacterial cells released from their hosts. From those, total 
RNA was isolated and analysed by RNA-seq as above except that sequencing was to 
a depth of ~20 million reads per library as necessitated by remaining host-derived 
RNA fragments. 

Western blotting (of bacterial and human proteins). Immunoblotting of 
Salmonella proteins was done as previously described. Briefly, samples from 
Salmonella in vitro cultures were taken corresponding to 0.4 OD600, centrifuged
for 4 min at 16,100g at 4°C, and pellets resuspended in sample loading buffer to a 
final concentration of 0.01 OD per µl. After denaturation for 5 min at 95°C, 0.05
OD equivalents of the sample were separated via SDS-PAGE. Gel-fractionated
proteins were blotted for 90 min (0.2 mA per cm²; 4°C) in a semi-dry blotter
(Peqlab) onto a PVDF membrane (Perkin Elmer) in transfer buffer (25 mM Tris
base, 190 mM glycine, 20% methanol). Blocking was for 1h at room temperature
in 10% dry milk/TBST20. Appropriate primary antibodies (see Supplementary 
Table 1) were hybridized at 4°C overnight and — following 3 x 10 min washing 
in TBST20 - secondary antibodies (Supplementary Table 1) for 1h at room 
temperature. 
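
The sample volumes implied by the OD-based loading scheme above follow from simple division. The helper below is a hypothetical illustration, reading "0.4 OD" as the harvested amount of cells (culture volume multiplied by OD600) and the loading buffer concentration as 0.01 OD per µl.

```python
def volume_ul(od_equivalents, od_per_ul):
    """Volume in µl that contains the requested OD equivalents."""
    return od_equivalents / od_per_ul

# Resuspension volume for a pellet of 0.4 OD at 0.01 OD per µl: about 40 µl.
resuspension_ul = volume_ul(0.4, 0.01)
# Gel loading volume for 0.05 OD equivalents: about 5 µl.
loading_ul = volume_ul(0.05, 0.01)
```

Normalizing to OD equivalents rather than to volume keeps the amount of bacterial material per lane comparable between cultures harvested at different densities.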

For western blotting of human proteins, infected cells were harvested in sample
loading buffer (500 µl per well; six-well format), transferred to 1.5 ml reaction
tubes, boiled for 5 min at 95°C and 20 µl per lane were loaded onto a 10% PAA gel
for SDS-PAGE as above. After blotting and blocking (as above), the membrane was 
probed with the respective primary antibody at 4°C overnight and—upon washing 
(as above)—with the secondary antibody for 1h at room temperature (a full list 
with information about all antibodies and sera used is given in Supplementary 
Table 1). 

After three additional washing steps of 10 min each in TBST20, blots were 
developed using western lightning solution (Perkin Elmer) in a Fuji LAS-4000. 
In Fig. 3e, intensities of protein bands were quantified using the AIDA software 
(Raytest, Germany) and normalized to GroEL levels. 

In vitro SPI-1 to SPI-2 switch assay. To mimic the early stages of the infection of 
a host cell in vitro, the indicated Salmonella strains were grown in LB overnight, 
diluted 1:100 in LB and grown to an OD600 of 2.0 (that is, a condition under which 
SPI-1 is highly induced), washed twice with PBS and once with SPI-2 medium 
at room temperature, diluted 1:50 in pre-warmed SPI-2 medium (defined as t0) 
and grown further in Erlenmeyer flasks at 37 °C for the indicated time periods. At 
the respective time points, samples were taken for RNA-seq, western blotting, and 
GFP fluorescence measurements. 

Bacterial GFP reporter measurements (flow cytometry, plate reader). To meas- 
ure the GFP intensity of reporter strains, bacteria were grown in LB in presence of 
Amp and Cm until an OD600 of 2.0 was reached. Salmonella cells corresponding 
to 1 OD600 were pelleted and fixed with 4% PFA. GFP fluorescence intensity was 
quantified for 100,000 events per sample by flow cytometry with the FACSCalibur 
instrument (BD Biosciences). Data were analysed using the Cyflogic software (CyFlo). 

To monitor SPI-2 activation in real time, a transcriptional gfp reporter was con- 
structed by inserting the SPI-2-dependent ssaG promoter into plasmid pAS0093 
via AatII/NheI sites as previously described. The resulting plasmid pYC104 
was co-transformed with either the pBAD-ctrl. or pBAD-PinT plasmid into the 
indicated strain backgrounds. The resulting strains were grown overnight in LB 
(+Amp + Cm) and then diluted 1:100 and further grown in the same medium 
to an OD600 of 2.0. A volume of 1 ml of the culture was pelleted and the collected 
cells shifted to SPI-2 medium (defined as t0) as described above, except that 
the growth experiment was conducted in 96-well plates (Nunc Microwell 96F, 
Thermo Scientific). After measuring the OD600 and GFP intensity at t0, L-arabinose 
was added to each well to a final concentration of 0.2% for sRNA induction 
and bacteria were grown for 20h at 37°C (with shaking) with measurements of 
both the OD595 and GFP fluorescence in 10 min intervals using the Infinite F200 
PRO plate reader (Tecan). 

Enzyme-linked immunosorbent assay (ELISA). HeLa-S3 cells were infected with 
wild-type Salmonella, ΔpinT or pinT+ mutant strains at an m.o.i. of 5 as described 
above. Culture supernatant samples were taken at 20h p.i. and analysed using the 
ELISA kit for human CXCL8/IL-8 (R&D Systems). 

Bioinformatic analyses. Code availability. In order to document the details and 
parameters of the (dual) RNA-seq data analyses and to make the biocomputa- 
tional approaches reproducible for others, we implemented the workflows as Unix 
Shell scripts. These scripts are deposited at Zenodo (DOI: 10.5281/zenodo.34695, 
https://zenodo.org/record/34695). Please refer to Supplementary Table 1 for 
descriptions of the analyses. 


Read processing and mapping. For all RNA-seq experiments listed in 
Supplementary Table 1, Illumina reads in FASTQ format were trimmed with a 
Phred quality score cut-off of 20 by the program fastq_quality_trimmer from 
FASTX toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/). Reads 
shorter than 20 nt after adaptor- and poly(A)-trimming were discarded before 
the mapping. The reads were aligned to the Salmonella enterica SL1344 genome 
(NCBI RefSeq accession numbers: NC_016810.1, NC_017718.1, NC_017719.1, 
NC_017720.1) and—where applicable—the human (hg19 - GRCh37; retrieved 
from the 1000 Genomes Project), the mouse (GENCODE M2, GRCm38.p2), or 
the porcine genome sequence (ENSEMBL, Sscrofa10.2), in parallel. The mapping 
was performed using the READemption pipeline (version 0.3.5) and the short 
read mapper segemehl and its remapper lack (version 0.2.0) allowing for split 
reads. Mapped reads with an alignment accuracy <90% as well as cross-mapped 
reads, that is, reads which could be aligned equally well to both host and Salmonella 
reference sequences, were discarded. The resulting data were used for visualization 
(see for example, Fig. 1b and Extended Data Fig. 2b). 
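
The quality trimming and length filter described above can be illustrated with a short Python sketch. It is not the FASTX toolkit code: it assumes Phred+33 quality encoding, trims only from the 3' end, and leaves out adaptor and poly(A) trimming. All function names are hypothetical.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding
MIN_QUALITY = 20   # Phred cut-off stated in the text
MIN_LENGTH = 20    # reads shorter than 20 nt are discarded


def trim_3prime(seq, qual, min_q=MIN_QUALITY):
    """Remove bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_reads(records, min_len=MIN_LENGTH):
    """Yield (name, seq, qual) for reads that pass trimming and the length filter."""
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual)
        if len(seq) >= min_len:
            yield name, seq, qual


# Toy reads: 'I' encodes Phred 40 and '#' encodes Phred 2.
reads = [
    ("r1", "ACGT" * 6, "I" * 20 + "#" * 4),   # 20 nt remain after trimming: kept
    ("r2", "ACGT" * 5, "I" * 10 + "#" * 10),  # 10 nt remain after trimming: dropped
]
kept = list(filter_reads(reads))
```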

Reads of the high resolution time-course experiment (cDNA libraries num- 
bers 27-77 in Supplementary Table 1) that were detected as cross-mapped by 
READemption (see above) were further inspected: their median percentage over 
the entire time-course was 0.25% with increased fractions for the later time points, 
implying that those reads are mainly contributed by Salmonella cells. We observed 
that the majority of the cross-mapped reads aligned to Salmonella rRNA or tRNA 
loci, while on the human side no gene class preference was observed (data not 
shown). 
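
The discard rule for mapped reads and the per-library cross-mapped percentage can be sketched as follows. The counts are made-up toy values, not data from the study, and the helper names are hypothetical; READemption's own implementation may differ.

```python
from statistics import median

MIN_ACCURACY = 90.0  # per cent; alignments below this are discarded (from the text)


def keep_alignment(accuracy_percent, cross_mapped):
    """Apply the discard rule: drop low-accuracy and cross-mapped reads."""
    return accuracy_percent >= MIN_ACCURACY and not cross_mapped


def cross_mapped_percentage(n_cross_mapped, n_mapped):
    """Percentage of mapped reads in one library that were cross-mapped."""
    return 100.0 * n_cross_mapped / n_mapped


# Hypothetical (cross-mapped, mapped) read counts for three libraries.
libraries = [(5, 1000), (20, 1000), (30, 1000)]
percentages = [cross_mapped_percentage(c, m) for c, m in libraries]
median_percentage = median(percentages)
```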

Differential gene expression analysis. For dual RNA-seq experiments (cDNA 
libraries 1-184, 215-256 in Supplementary Table 1), differential expression 
analysis was carried out after mapping, separately for the host and the pathogen. 
Strand-specific gene-wise quantifications for each data subset were performed by 
READemption. Host transcript expression analyses are based on annotations 
from GENCODE (version 19), NONCODE (version 4) and miRBase (version 
20) after removing redundant entries. The annotation for Salmonella genes was 
retrieved from NCBI (under the above mentioned accession numbers) and manually 
extended with small RNA annotations. In either organism, multi-mapped 
reads were removed and only uniquely mapped reads were considered for the 
expression analysis. Differential gene expression analyses were performed with 
the edgeR package (version 3.10.2) using an upper-quartile normalization and 
a prior count of 1. Where needed (that is, to correct for batch effects in the comparisons 
between wild-type and mutant infections; the comparisons displayed 
in Figs 3 and 4 and Extended Data Figs 5, 7-9), sequencing data were further 
normalized using the RUVs correction method with k = 3. For this purpose, we 
treated the samples time-point-wise to remove unwanted nuisance factors. At each 
time point our covariate of interest was the pinT status of the infecting bacterium. 
This is constant within replicate blocks, which are used for the RUVs correction. 
Host or bacterial genes with at least 10 uniquely mapped reads in three replicates 
were considered detected. Genes with an adjusted P value < 0.05 were considered 
differentially expressed. Differential expression analysis for conventional (bacteria 
only) RNA-seq experiments (cDNA libraries numbers 185-214 in Supplementary 
Table 1) was done similarly, except that a cut-off of >50 uniquely mapped reads 
was used as a detection threshold. 
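
The detection threshold and the idea behind upper-quartile scaling can be illustrated in Python. This is a simplified stand-in and does not reproduce edgeR's normalization. It assumes that "in three replicates" means each of three replicates must reach the cut-off, and the function names are hypothetical.

```python
def is_detected(counts, min_reads=10, min_replicates=3):
    """True if at least `min_replicates` replicates have >= `min_reads` unique reads.

    The text uses a cut-off of 10 reads for dual RNA-seq; for the
    bacteria-only libraries it states a cut-off of more than 50 reads.
    """
    return sum(c >= min_reads for c in counts) >= min_replicates


def upper_quartile(values):
    """75th percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = 0.75 * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def upper_quartile_scale(library):
    """Divide one library's counts by the upper quartile of its non-zero counts."""
    uq = upper_quartile([c for c in library if c > 0])
    return [c / uq for c in library]


# Toy library of per-gene counts; the upper quartile of 4, 8, 12, 16 is 13.
scaled = upper_quartile_scale([0, 4, 8, 12, 16])
```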

Read coverage plots. Based on the obtained BAM files, coverage files in wiggle 
format were generated by READemption in a strand-specific manner and split 
by organism. In each case, coverage files are based on uniquely mapped reads and 
normalized by the total number of uniquely aligned reads per organism. For Fig. 4e, 
wiggle files were visualized using the Integrated Genome Browser (version 8.4.4). 
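
The per-organism coverage normalization can be written as a one-line rescaling. The sketch below uses made-up toy values, and the `scale` parameter is a hypothetical convenience: the text does not state which multiplier, if any, was applied.

```python
def normalise_coverage(coverage, total_unique_reads, scale=1.0):
    """Divide per-position coverage by the organism's total uniquely aligned reads.

    `scale` is a hypothetical multiplier (for example 1e6 for per-million values).
    """
    if total_unique_reads <= 0:
        raise ValueError("total_unique_reads must be positive")
    return [scale * depth / total_unique_reads for depth in coverage]


# Toy per-position coverage for one strand of one organism.
raw = [0, 5, 10, 20]
normalised = normalise_coverage(raw, total_unique_reads=1000, scale=1e6)
```
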

Pathway enrichment and further analyses for Salmonella. A database of pathways, 
regulons, and genomic islands was constructed using information obtained 
from the KEGG database (organism code sey), the SL1344 genome annotation, 
and relevant literature sources (see Supplementary Table 1). Pearson correlation 
coefficients between changes in PinT expression and changes in expression of each 
gene within each regulon over the time-course of wild-type Salmonella infection 
(cDNA libraries number 27, 30, 33, 36, 39, 42, 44, 47, 50, 53, 56, 59, 61, 64, 67, 70, 
73, 76 in Supplementary Table 1) were plotted in Fig. 2d. 

To assess enrichment of differentially expressed transcripts in pathways in the 
comparative infection experiments (cDNA libraries numbers 27-77 and 152-184 
in Supplementary Table 1) and the in vitro assay (cDNA libraries numbers 
185-202 in Supplementary Table 1), gene set enrichment analysis (GSEA; version 
2.1.0) was run on the log2 fold changes reported by edgeR. The GSEA was performed 
in ranked list mode (with statistic classic) and gene sets containing fewer than 15 or 
more than 100 entries were excluded. Extended Data Fig. 5a reports all pathways 
significant at an FDR-corrected P value of at most 0.05 in at least one time point. 
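
The gene-set size filter and the unweighted ("classic") running-sum statistic can be sketched in Python. This is an illustration under stated assumptions, not the GSEA software: set sizes are counted before restricting to detected genes, significance estimation by permutation is omitted, and the function names are hypothetical.

```python
def filter_gene_sets(gene_sets, min_size=15, max_size=100):
    """Keep gene sets whose size lies within [min_size, max_size]."""
    return {name: genes for name, genes in gene_sets.items()
            if min_size <= len(genes) <= max_size}


def classic_enrichment_score(ranked_genes, gene_set):
    """Unweighted (Kolmogorov-Smirnov-like) running-sum enrichment score.

    `ranked_genes` must be sorted by decreasing log2 fold change. Each member
    of the set adds 1/n_hit to the running sum, each non-member subtracts
    1/n_miss; the score is the running sum's largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: a set at the top of the ranking scores +1, at the bottom -1.
ranked = ["g1", "g2", "g3", "g4"]
top_score = classic_enrichment_score(ranked, {"g1", "g2"})
bottom_score = classic_enrichment_score(ranked, {"g3", "g4"})
```
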

Host pathway analysis and inspection of alternative isoform expression. Host 
pathway enrichment studies were performed consistently with bacterial analyses 
using GSEA on human pathways available in the KEGG database (downloaded 
January 22, 2014) using the same settings described above. Pathways with an 
adjusted P value < 0.05 were considered to be significantly modulated. Data 
visualization for Extended Data Fig. 8a was produced using the Bioconductor 
package Pathview. Genes displayed in Fig. 1d, that is, genes whose transcription is 
known or predicted to be regulated by the binding of nuclear factor κB (NF-κB) 
to their promoter or genes whose products have been shown to promote an NF-κB 
response, were retrieved from the GeneCards and Boston University Biology 
(http://www.bu.edu/nf-kb/gene-resources/target-gene) databases or refs 77, 78. 
STAT3 target genes denoted in Fig. 4b were retrieved from ref. 79. 

We used Cufflinks/Cuffdiff (version 2.2.1) to test for differentially expressed 
isoforms in the high-resolution, comparative dual RNA-seq time-course data set 
(cDNA libraries number 27-77 in Supplementary Table 1). In a first step, we used 
Cufflinks to quantify transcript isoforms in the mapped read data. Afterwards, all 
transcript annotations were merged using Cuffmerge and differentially expressed 
isoforms were called using Cuffdiff. 

Interspecies correlation analysis of pathogen and host transcriptome changes. 
To identify bacterial and human genes with similar expression kinetics across the 
time-course of the infection of HeLa-S3 cells (cDNA libraries number 27-77 in 
Supplementary Table 1), we used RUVs-corrected, abundance-filtered and normal- 
ized read counts (see above). Absolute counts were then transformed into stand- 
ard z-scores for each gene over all considered samples as follows: for each gene, 
the z-score was calculated as the absolute read count minus the mean read count 
over all samples, divided by the standard deviation of all counts over all samples. 
Genes with a standard deviation <2 were excluded from further analysis. Pearson 
correlation coefficients were calculated between all remaining bacterial genes and 
all remaining human genes, and P values were calculated using the function cor.test 
in R. To account for a possible temporal delay between Salmonella expression 
changes and effect manifestation in the host cell, a time-shift was allowed. This 
means the expression of Salmonella genes at each time point was compared to 
host expression at the subsequent time point. Human genes were considered to 
be correlated with a bacterial gene if they had a P value of less than 10°-* and a 
Pearson's r greater than 0.65. This resulted in a total of 751 clusters of human 
genes showing correlation in expression with a bacterial gene, approximately half 
of which (see Supplementary Table 1) had at least one enriched GO term associated 
with them (adjusted P value < 0.05) as tested using the software tool Ontologizer 
2.0 (build: 20100310-351) with the gene ontology definition obtained from the 
Gene Ontology Consortium (data-version: releases/2015-09-26) and the Universal 
Protein Resource (UniProt) gene annotation (generated: 2015-09-14). 
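
The z-score transformation, the standard-deviation filter and the time-shifted correlation can be sketched in plain Python. Several choices here are assumptions rather than statements from the text: the sample (n − 1) standard deviation is used, each series holds one value per time point, and the P value that cor.test would report is not computed. Function names are hypothetical.

```python
from math import sqrt


def sample_sd(counts):
    """Sample standard deviation (n - 1 denominator; an assumption)."""
    n = len(counts)
    mean = sum(counts) / n
    return sqrt(sum((x - mean) ** 2 for x in counts) / (n - 1))


def zscores(counts):
    """(count - mean over all samples) / standard deviation over all samples."""
    mean = sum(counts) / len(counts)
    sd = sample_sd(counts)
    return [(x - mean) / sd for x in counts]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def time_shifted_r(bacterial, host, shift=1):
    """Correlate bacterial expression at time t with host expression at t + shift."""
    if shift:
        bacterial, host = bacterial[:-shift], host[shift:]
    return pearson_r(bacterial, host)


# Toy series in which the host follows the bacterium with a lag of one time point.
bacterial = [1, 2, 4, 8, 16]
host = [0, 1, 2, 4, 8]
keep = sample_sd(bacterial) >= 2 and sample_sd(host) >= 2  # SD filter from the text
r_shifted = time_shifted_r(zscores(bacterial), zscores(host), shift=1)
```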

To account for the possibility that multiple bacterial genes might be associated 
with a human gene cluster, a correlation analysis was performed for all against all 
bacterial genes as described above, with the only exception that no time-shift was 
allowed. For this, we focused on seventeen gene clusters that were built on bacterial 
genes encoding for secretion-associated gene products (according to UniProt; see 
Supplementary Table 1). Detailed inspection of these clusters revealed the one 
depicted in Fig. 4b (centred on the bacterial SPI-2 gene sseC) which contained 
many further (bacterial and human) genes with pronounced PinT-dependent 
expression changes, that is, genes that showed differential expression between 
wild-type and ΔpinT infection at several time points p.i. 

Statistics. In all RNA-seq-based analyses, transcript expression changes that were 
associated with an adjusted P value < 0.05 (reported by edgeR) were considered 
significantly differentially expressed. For Fig. 3b, a Monte Carlo permutation test 
was performed on the median fold change of genes in the SPI-2 regulon, using 
10° randomly selected gene sets of the same size. This indicated the significant 
de-repression (P< 0.05) of the SPI-2 regulon in the absence of PinT at 2 and 8h 
after the infection of HeLa cells, at 2, 6 and 16h after the infection of 3D4/31 cells, 
and in the in vitro assay. Tests for the evaluation of increased host cell death in 
Extended Data Fig. 1a were performed using a one-tailed Student's t-test. *P values 
< 0.05 were considered significant and ***P values < 0.001 were considered 
very significant. The significance of gene activation in qRT-PCR results (Fig. 4c 
and Extended Data Figs 5b, c and 7c, d) or the ELISA assay (Extended Data Fig. 7e) 
was assessed using a one-tailed Mann-Whitney U-test. The significance of differences 
in intracellular replication between the ΔpinT strain and wild-type Salmonella 
(Extended Data Fig. 4d) was evaluated using a two-tailed Mann-Whitney U-test. 
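
A Monte Carlo permutation test on the median fold change of a gene set can be sketched as follows. This is a generic illustration, not the authors' script: the one-sided direction (testing for higher medians), the add-one correction, the default number of random sets and all names are assumptions, and the input values are toy data.

```python
import random
from statistics import median


def permutation_p_value(fold_changes, gene_set, n_permutations=2000, seed=0):
    """One-sided Monte Carlo P value for the median fold change of `gene_set`.

    `fold_changes` maps gene -> log fold change. Random gene sets of the same
    size are drawn without replacement from all genes, and the P value is the
    fraction whose median is at least as large as the observed one.
    """
    rng = random.Random(seed)
    genes = list(fold_changes)
    observed = median(fold_changes[g] for g in gene_set)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(genes, len(gene_set))
        if median(fold_changes[g] for g in sample) >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: 100 genes with no change, five of which form an induced "regulon".
fold_changes = {f"g{i}": 0.0 for i in range(100)}
regulon = [f"g{i}" for i in range(5)]
for gene in regulon:
    fold_changes[gene] = 2.0
p_value = permutation_p_value(fold_changes, regulon)
```
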

Extended Data Figure 1 | Establishment of the infection model with 
HeLa-S3 cells and constitutively GFP-expressing Salmonella. a, Rate of 
infected, apoptotic and cytotoxic HeLa-S3 cells over a range of different 
m.o.i.s. Left panel: infectivity increases with increasing bacterial doses. 
The discrepancy between the fractions of infected cells at 4h and 24h p.i. 
results from increasing levels of host cell death over time. Quantification 
of infectivity was achieved via flow cytometry (FACSCalibur, BD 
Biosciences) and the Cyflogic software (CyFlo) by gating for the GFP+ and 
GFP− populations. Middle: apoptosis measurement by annexin V-APC 
(BD Pharmingen)/propidium iodide (PI) (Sigma) staining followed by 
flow cytometry using the MACSQuant Analyzer device (Miltenyi Biotec). 
APC-positive/PI-negative cells were considered apoptotic. Right: lactate 
dehydrogenase (LDH) release as a proxy for necrosis in infected HeLa-S3 
cultures. The colorimetric product was quantified by measuring the 
absorbance at 490 nm. 100% host cell death was determined by treating the 
cell culture with lysis buffer before analysis. *P < 0.05; ***P < 0.001 
(one-tailed Student's t-test). b, Intracellular replication of Salmonella inside 
HeLa-S3 (m.o.i. of 5). Left panel: flow cytometry-based quantification 
of the increase in GFP intensity per infected host cell over time (see also 
panel c and Methods). Right: c.f.u. counts. The inset illustrates the relative 
amount of intracellular bacteria at 4 or 24h p.i., respectively, as compared 
to the input. Combination of infectivity (panel a) and c.f.u. data (panel b) 
allows for the calculation of the average number of intracellular bacteria 
per invaded cell at distinct time points: 4h p.i.: ~10 bacteria; 24h p.i.: 
~75 bacteria. c, Representative overlay histogram of flow cytometry data 
exemplifying the increase in GFP intensity per infected HeLa-S3 cell 
over time. Plot was generated using the Flowing software (Turku Centre 
for Biotechnology, Finland). d, Representative FACS plots showing the 
gating strategy for the separation of invaded and non-invaded HeLa-S3 
cells. The signal detected in the phycoerythrin (PE) channel was used 
as a proxy for a cell's autofluorescence. The percentage values indicate 
the relative proportion of GFP+ and GFP− cells before ('pre-sort') and 
after sorting ('re-analysis'). e, Capillary electrophoresis of total RNA 
samples from infected cells (4h p.i.) that were fixed overnight using 
different reagents. Stop solution refers to 95% EtOH/5% phenol. Where 
indicated ('+sucr.') paraformaldehyde (PFA) was supplemented with 2% 
sucrose. The band pattern of pure Salmonella RNA is shown to the left. 
Note that in the infection samples bacterial rRNA bands are not visible 
due to the overwhelming host RNA background. For gel source data, see 
Supplementary Fig. 1. f, Influence of different preservatives on FACS-based 
recovery of invaded host cells through bleaching of the GFP signal. 
g, Increasing concentrations of RNAlater kill intracellular Salmonella 
(black line) but do not compromise detection of GFP fluorescence (green 
bars). h, Extrapolation of the relative representation of Salmonella and 
human transcriptomes in infection samples. A dilution series of separately 
isolated Salmonella to HeLa-S3 total RNA was set up and for each ratio, 
bacterial rfaH (relative to human ACTB) mRNA was quantified by qRT-PCR. 
The resulting trend-line equation was used to infer the percentage 
of the Salmonella transcriptome within mixed total RNA samples from 
infected HeLa-S3 cells at different time points and without (blue) or upon 
FACS-based enrichment for invaded cells (red). The position of medium 
control samples (LB, DMEM) is given in grey. i, Normalized read counts 
for all detected Salmonella or human genes at 4h p.i. are plotted for 3 
biological replicate experiments and the Pearson correlation coefficient 
(r) is given. Panels a, b, f, g, h show the mean ± s.d. from 3 biological 
replicates each. 

Extended Data Figure 2 | Establishment of an rRNA-depletion step 
for dual RNA-seq. a, Experimental workflow. b, Comparative mapping 
statistics of dual RNA-seq samples without (upper panel) or upon the joint 
depletion of bacterial and eukaryotic rRNA using the Ribo-Zero Gold 
(Epidemiology) kit (lower panel). c, Gene-wise correlation between read 
coverages without or upon rRNA removal for Salmonella (left) and human 
(right) data subsets. The Pearson's r is given. The red dots in squares 
represent the rRNA transcripts that had zero reads in the rRNA-depleted 
sample. 

Extended Data Figure 3 | PinT is induced during infection via PhoP 
binding to its promoter region. a, Scheme of the comparative high-resolution 
time-course analysed by dual RNA-seq. For both the Salmonella 
strains, five individual time points post-invasion of HeLa-S3 cells were 
sampled and enriched for the fraction of invaded (GFP+) host cells. 
Mock-treated cells were used as a host, and extracellular Salmonella 
in DMEM medium (0h) as a bacterial reference control. Together this 
resulted in 17 different conditions which were sampled as biological 
triplicates. b, Average RPKM distribution over individual Salmonella 
or human transcript classes from the wild-type infection time-course. 
c, PinT activation during invasion of HeLa-S3 cells as revealed by dual 
RNA-seq (red graph) can be reproduced by qRT-PCR measurements 
(black dots; each dot represents one of 4 (2, 8, 16h), 5 (4h) or 
6 (24h) biological replicate experiments). Normalization was achieved 
using the constitutively expressed gfp mRNA as a reference. d, Northern 
blot detection of PinT in the Salmonella wild-type and various mutant 
backgrounds in which the indicated global regulators were deleted. 
For gel source data, see Supplementary Fig. 1. e, Mutational analysis 
identifies PhoP as a direct transcriptional activator of the pinT promoter. 
A transcriptional gfp fusion construct containing the pinT upstream 
promoter region (−41 bp to +5 bp) was analysed in the wild-type, 
phoP deletion or phoP complementation backgrounds. A non-fluorescent 
('no gfp') or unrelated ompC promoter reporter served as negative 
controls and a phoP promoter reporter as a positive control (as PhoP is 
known to auto-regulate its own expression). Two-nucleotide exchanges 
(T−27/−28 → A−27/−28) in the predicted PhoP binding site (see alignment in 
panel f) are sufficient to abrogate PhoP responsiveness of PinT expression. 
Error bars indicate the s.d. from the mean from biological triplicates. 
f, Sequence alignment shows the conservation of PinT sRNA within 
the genus Salmonella. "STY": S. Typhi, "SEN": S. Enteritidis, "SGA": 
S. Gallinarum, "SAR": S. arizonae, "SBG": S. bongori. Perfectly conserved 
ribonucleobases are labelled in red, less conserved bases are shown in blue. 
The numbers indicate the position relative to the 5' end of PinT 
(+1 position). Black lines and sequence motif below the alignment 
highlight a PhoP binding site (the asterisks denote thymines that were 
converted into adenines for mutational analysis in panel e). Asterisks 
below the seed sequence (position ~30-40) mark two guanines that 
were mutated to cytosines in Fig. 3c and Extended Data Fig. 6. 
g, h, Infection rate (g) and intracellular replication kinetics (h) of the 
indicated Salmonella strains in HeLa-S3 cells (m.o.i. of 5). Data are derived 
from flow cytometry measurements as described for Extended Data 
Fig. 1a, b, and refer to the mean ± s.d. from 3 biological replicates. 

Extended Data Figure 4 | PinT is strongly induced upon the invasion 
of diverse host cell types while its deletion only has slight effects in 
cell culture models. a, The given cell types were infected with wild-type 
Salmonella for the indicated time periods and total or rRNA-depleted RNA 
(as indicated in Supplementary Table 1) was sequenced. THP-1 cells either 
were differentiated by treating them with phorbol myristate acetate before 
infection ("+PMA") or were kept monocytic ("−PMA"). For all but HeLa-S3 
cells and porcine cell types, infection was established at an m.o.i. of 10. 
IPEC-J2 and 3D4/31 cells were infected at an m.o.i. of 20 and HeLa-S3 
at an m.o.i. of 5. Shown are detected (>10 reads in each replicate; light 
grey) and regulated sRNAs (adjusted P value < 0.05; dark grey). The data 
were derived from 3 biological replicates for HeLa-S3 and 3D4/31 and 2 
biological replicates for the other cell types. PinT expression (red line) was 
significantly upregulated in all cell types. b, qRT-PCR validation of the 
induction of PinT in five selected host cell types. RT-PCR data from 
3 biological replicates were drawn (solid lines). Normalization was against 
gfp mRNA. The data from HeLa-S3 are the same as in Extended Data Fig. 3c. 
For comparison, in each case the RNA-seq-based expression data 
of PinT (as shown in panel a) are re-plotted (dashed curves). 
c, Invasion assays for wild-type, ΔpinT and pinT+ Salmonella with 
HeLa-S3 (m.o.i. 5), CaCo-2 and undifferentiated THP-1 (both m.o.i. 10), 
as well as IPEC-J2 and 3D4/31 (both m.o.i. 20). In all cases, the invasion 
rate was profiled 10 min p.i. by flow cytometry. d, Intracellular replication 
kinetics for the same strains and host cell types. The increase in GFP 
intensity of infected cells over time was monitored by flow cytometry and 
expressed as fold change compared to the t0 time point (see Methods). 
The asterisk denotes a significantly different replication rate between the 
wild-type and ΔpinT strain (P < 0.05; two-tailed Mann-Whitney U-test). 
e, Host cell cytotoxicity measurements for the same host cell types upon 
infection with the indicated Salmonella strains. At the respective time 
points p.i., LDH activity in the supernatant of the infected cultures was 
quantified and the increase over time was calculated with respect to the LDH 
activity measured in supernatants of mock-infected cells. The data in panels 
c-e represent the mean ± s.d. from 3 biological replicates. 

Extended Data Figure 5 | Salmonella PinT sRNA represses SPI-2 
expression during the infection of HeLa-S3 cells and pig macrophages. 
a, The heat map shows the result from gene set enrichment analyses of 
Salmonella gene expression data from the comparative dual RNA-seq 
time-course experiments with HeLa-S3 cells and porcine 3D4/31 
macrophages, and from the comparative RNA-seq experiment of 
Salmonella grown for 1h under SPI-2-inducing in vitro conditions. It 
reveals the de-repression of SPI-2 genes in the absence of PinT (ΔpinT) 
at several time points, a specific effect as SPI-2 expression reverts to 
wild-type levels upon trans-complementation of PinT (pinT+) in the in 
vitro assay. b, RT-PCR measurements during the early stages of HeLa-S3 
infection validate the de-repression of Salmonella SPI-2 genes in the 
ΔpinT background (red) as compared to both wild-type (black) and pinT+ 
(grey) strains. Normalization was performed using gfp mRNA. Dots 
represent individual biological replicate experiments (five for hilD, sseA, 
ssrB; four for the other mRNAs) and the solid lines indicate their mean. 
c, RT-PCR validation of de-repressed SPI-2 genes 6h after the invasion 
of pig macrophages. Porcine GAPDH mRNA was used for normalization. 
The data represent the results from biological triplicate measurements. 
The asterisks in panels b and c denote significantly increased transcript 
levels in ΔpinT compared to wild-type Salmonella (P < 0.05; one-tailed 
Mann-Whitney U-test). 

Extended Data Figure 6 | PinT directly targets Salmonella sopE/E2, 
grxA and crp mRNAs. a, Volcano plot showing Salmonella mRNA levels at 
5 min after the pulse-expression of PinT under SPI-1-inducing conditions 
(LB medium, OD600 of 2.0). The data are derived from two biological 
replicates. Candidates that were confirmed to be directly targeted by 
PinT are coloured. Red: targets regulated predominantly under SPI-1 
conditions; blue: targets regulated (also) under SPI-2 conditions 
(see panel e). For the full list of changes in gene expression after the PinT 
pulse see Supplementary Table 1. b, RNA duplex formation between 
PinT and the sopE and sopE2 leaders as predicted by the RNA-hybrid 
program. Point mutations introduced for compensatory base-pair 
exchange experiments are indicated. The ribosome binding site and start 
codon are marked in blue or red, respectively. c, Validation of the 
base-pair interactions as shown in panel b using translational sopE::gfp and 
sopE2::gfp reporter gene fusions by compensatory base-pair exchanges. 
Salmonella strains containing both a gfp reporter plasmid and an sRNA 
overexpression vector were grown overnight in LB and analysed by flow 
cytometry. The error bars indicate s.d. from the mean from biological 
triplicates. d, In vivo pulse-expression establishment. GFP-expressing 
Salmonella strains harbouring PinT sRNA under an L-arabinose-inducible 
promoter on a plasmid or corresponding control strains, respectively, were 
used to infect HeLa-S3 cells. At 4h p.i., L-arabinose was added to the cell 
medium. Samples were taken over a time-course of 20 min after the pulse 
and enriched for Salmonella transcripts (see Methods section). PinT sRNA 
levels were measured by qRT-PCR in the resulting RNA samples and are 
plotted (mean ± s.d. from technical triplicates). e, Pulse-expression of PinT 
under in vivo(-like) conditions (the full data are in Supplementary Table 1). 
PinT was transiently overexpressed under SPI-2-inducing conditions in 
vitro or 4h after HeLa-S3 infection (see panel d). In either case, 5 min 
after induction total RNA samples were taken and analysed by RNA-seq 
(two biological replicates each). Axes represent fold changes in mRNA 
abundance between strains harbouring the empty and the PinT-containing 
plasmid. The two targets validated in panel f are labelled in blue. The celB 
(cellobiose-specific permease IIC component) and ecnB (entericidin B 
precursor) mRNAs might be further targets of PinT (see also panel a), but 
were not followed up here. f, Validation of direct targeting of grxA and 
crp mRNAs by the seed region of PinT using translational grxA::gfp and 
crp::gfp reporter gene fusions as described for panel c. 

Extended Data Figure 7 | Host expression data for the comparative 
infection with wild-type or ΔpinT Salmonella. a, Heat maps showing 
differentially expressed mRNAs (upper panels), lncRNAs (middle) or 
miRNAs (lower) between the infection of HeLa-S3 cells with either one of 
the two indicated Salmonella strains and mock-infected controls. Plotted 
are all genes that were significantly differentially expressed (adjusted 
P value < 0.05; 3 biological replicates) between the indicated conditions 
for at least one time point during infection. Numbers to the right refer to 
detected transcripts and transcripts differentially expressed after 
wild-type infection vs mock, or after ΔpinT infection vs mock, 
respectively. b, PinT affects the expression of coding and noncoding 
transcripts of the human host. Venn diagrams indicate HeLa-S3 transcripts 
commonly or specifically regulated compared to mock for the respective 
infection strain. c, Infection with ΔpinT Salmonella leads to increased 
SOCS3 expression in porcine macrophages. qRT-PCR data (mean ± s.d.) 
from biological triplicate experiments of the infection of 3D4/31 cells with 
Salmonella wild-type, ΔpinT, or pinT+. Total RNA samples were taken at 
6h p.i. Porcine GAPDH mRNA was used for normalization. d, RT-PCR 
measurement of human IL8 mRNA in total RNA samples isolated from 
HeLa-S3 cells at 16h after either wild-type, ΔpinT or pinT+ infection 
and each sorted into the fractions of invaded (GFP+) and non-invaded 
(GFP−) cells. U6 snRNA was used for normalization. The data represent 
the mean ± s.d. from 3 biological replicates. Note that differential gene 
expression was not caused by different bacterial loads as judged from the 
ratio of gfp mRNA to U6 snRNA ('bacterial load control'). e, Enzyme-linked 
immunosorbent assay for human IL-8 protein in supernatant 
samples from HeLa-S3 cells at 20h after their infection with the indicated 
Salmonella strains. Data refer to the mean ± s.d. from biological triplicate 
measurements. f, Karyogram plot displaying the individual human 
(female) chromosomes and the genomic position of differentially 
expressed lncRNA candidates. lncRNAs differentially regulated (adjusted 
P value < 0.05; 3 biological replicates) in response to wild-type infection 
compared to mock-treated HeLa-S3 cells are indicated as grey bars and 
candidates differentially expressed between wild-type- and ΔpinT-infected 
cells as red bars. The position of the bars relative to the respective 
chromosome indicates the direction of regulation (above the chromosome: 
upregulation; below: downregulation at the earliest time point when 
regulation was observed). Panels c-e, asterisks denote significantly 
different transcript (c, d) or protein (e) levels between wild-type- and 
ΔpinT-infected cells (P < 0.05; one-tailed Mann-Whitney U-test). 
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Extended Data Figure 8 | Impact of PinT on host mitochondria. a, KEGG 
Pathview representation of oxidative phosphorylation in mitochondria. 
Dual RNA-seq data of infected HeLa-S3 cells (3 biological replicates) were 
plotted on top of the pathway map. Individual boxes represent the different 
time points sampled. Coloured boxes: at the respective time point the given 
gene(-cluster) was hyper-activated (red) or suppressed (blue) in ApinT- 
infected compared to wild-type-infected cells. b, Northern blot detection 

of mitochondrial tRNAs (shown in Fig. 4e) in HeLa-S3 cells infected for 
16h with the indicated Salmonella strains and sorted for GFP* and GFP~ 


fractions. Salmonella 5S rRNA serves as a bacterial and human U6 snRNA 

as a host control. The ‘Salmonella only’ sample demonstrates specificity of 

the probes against the respective human (but not bacterial) tRNAs. For gel 
source data, see Supplementary Fig. 1. c, Elevated mitochondrial expression in 
response to ApinT infection is accompanied by the sub-cellular re-localization 
of mitochondria in invaded hosts. Mitochondria of infected HeLa-S3 cells 
were stained using the MitoTracker Orange dye (Life Technologies) and 
nuclei with Hoechst (Invitrogen). The scale bar indicates 151m. The white 
arrowhead marks a prominent cluster of re-localized mitochondria. 


© 2016 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 9: image not reproduced. Recoverable labels: strains WT, ΔryhB/ΔisrE, ΔomrA/B and ΔyrlA/B; 5S rRNA loading control; axis ‘time p.i. [h]’ with time points 4 h and 16 h; heat-map gene groups SL1344_3511 (hypothetical oxidoreductase), lipoproteins and ribosomal subunits; host-pathway categories: common, sRNA-specific, affected by 2 mutants, affected by only 1 mutant.]

Extended Data Figure 9 | Comparative dual RNA-seq experiments
with further sRNA mutant Salmonella. a, Northern blot detection of
additional sRNAs induced upon host cell invasion (see Fig. 2a). ‘Inoculum’
refers to bacteria in DMEM. Gel source data in Supplementary Fig. 1.
b, HeLa-S3 invasion efficiency of the three indicated sRNA double
mutants and wild-type Salmonella (m.o.i. of 5). c, Intracellular replication
kinetics of the same strains inside HeLa-S3 (m.o.i. of 5). Data in panels
b and c are derived from flow cytometry measurements as described for
Extended Data Fig. 1a, b, and refer to the mean ± s.d. from 3 biological
replicates. d, Dual RNA-seq data of the infection of HeLa-S3 cells with the
indicated Salmonella strains at 0 h, 4 h, and 16 h p.i. The Venn diagram
shows the number of significantly (adjusted P value < 0.05) differentially
expressed Salmonella genes (combined for all three time points) between
wild-type Salmonella and the respective sRNA double mutant strain.
e, Heat map of bacterial mRNA and sRNA expression changes for
ΔomrA/B, ΔryhB/ΔisrE and ΔyrlA/B as compared to wild-type
Salmonella at 0 h, 4 h and 16 h p.i. of HeLa-S3 cells. Plotted are all
genes (except the respectively deleted sRNAs) that were significantly
differentially expressed (adjusted P value < 0.05) for at least one of
the indicated conditions. The gene for a putative oxidoreductase
(SL1344_3511) was specifically upregulated in intracellular ΔryhB/ΔisrE
mutants compared to wild-type Salmonella. Many genes for ribosomal
subunits were downregulated in all three mutant strains compared to the
wild-type. f, Venn diagram for the number of commonly or specifically
affected human pathways between Salmonella wild-type infection and
that of the three mutant strains (further specified in panel g). g, Infection
of HeLa-S3 with three sRNA double mutant strains affects distinct sets of
host pathways as compared to wild-type infection. Shown are normalized
enrichment scores (norm. ES) from a human gene set enrichment analysis
(adjusted P value < 0.05). The dual RNA-seq data in panels d-g were
derived from 3 biological replicate experiments.

Formation of new stellar populations from gas
accreted by massive young star clusters 


Chengyuan Li1,2,3, Richard de Grijs1,4, Licai Deng2, Aaron M. Geller5,6, Yu Xin2, Yi Hu2 & Claude-André Faucher-Giguère5


Stars in clusters are thought to form in a single burst from a 
common progenitor cloud of molecular gas. However, massive, old 
‘globular’ clusters—those with ages greater than ten billion years 
and masses several hundred thousand times that of the Sun—often 
harbour multiple stellar populations!‘, indicating that more than 
one star-forming event occurred during their lifetimes. Colliding 
stellar winds from late-stage, asymptotic-giant-branch stars”’ are 
often suggested to be triggers of second-generation star formation. 
For this to occur’, the initial cluster masses need to be greater 
than a few million solar masses. Here we report observations of 
three massive relatively young star clusters (1-2 billion years old) 
in the Magellanic Clouds that show clear evidence of burst-like 
star formation that occurred a few hundred million years after 
their initial formation era. We show that such clusters could have 
accreted sufficient gas to form new stars if they had orbited in their 
host galaxies’ gaseous disks throughout the period between their 
initial formation and the more recent bursts of star formation. This 
process may eventually give rise to the ubiquitous multiple stellar 
populations in globular clusters. 

The colour-magnitude diagrams of NGC 1783, NGC 1696 and NGC 
411 are shown in Fig. 1. These stellar distributions—the observational 
counterparts of the theoretical Hertzsprung—Russell diagrams, which 
relate the stellar surface temperatures to their luminosities—have been 
field-star-decontaminated by careful application of statistical back- 
ground-subtraction techniques (see Methods). Figure 1a shows that 
the majority of stars associated with NGC 1783 are well-described 
by an isochrone®—a theoretical ridge-line that describes stars with 
identical ages but covering a range of initial masses—characterized 
by an age t given by log t = 9.15 (that is, t = 1.4 × 10⁹ yr; see Methods).
However, one can also clearly discern two additional, bright stellar
sequences, denoted ‘A’ and ‘B’, with younger ages of, respectively,
log t = 8.65 (t = 450 Myr) and log t = 8.95 (t = 890 Myr). These two
sequences appear to have chemical compositions that are similar to 
the cluster’s bulk stellar population (particularly in terms of the abun- 
dances of helium and heavier elements), given the absence of any clear 
differences in the observed ridge-line colours. Sequence A may also 
include a subpopulation of stars associated with the cluster’s red clump, 
the (0.7-2) M⊙ (M⊙, solar mass) analogues of the helium-burning
horizontal-branch stars (indicated by the orange area). Figure 1b 
shows the colour-magnitude diagram pertaining to NGC 1696. It 
also exhibits a bright, young simple-stellar-population sequence 
characterized by log t = 8.70 (500 Myr), whereas the bulk of the NGC
1696 stars are best represented by an older age of log t = 9.18 (1.5 Gyr).
Compared with the cluster’s dominant main sequence, the younger
sequence exhibits a colour offset of approximately −0.06 mag (a shift
to bluer colours), which is consistent with a stellar population
characterized by an enhanced helium abundance of Y = 0.330 (a helium
mass fraction of 33.0%), while Y = 0.256 for the bulk of the cluster
stars. A similar result for NGC 411 is shown in Fig. 1c, where we also
find an additional, brighter stellar sequence that is well-represented
by a younger isochrone of log t = 8.50 (320 Myr) and Y = 0.400,
compared with log t = 9.14 (1.4 Gyr) and Y = 0.252 for the bulk of the
cluster’s stellar population. These well-populated, younger and helium- 
enhanced stellar sequences represent the strongest evidence yet that 
additional, post-formation starburst events may have occurred in our 
sample clusters. The enhanced helium abundances may also lead to 
small changes in the best-fitting cluster ages, but there are currently no 
appropriate model isochrones available to accurately explore this effect 
for stellar populations younger than 1 Gyr. Nevertheless, the general 
sense of helium-enhanced young sequences shown here is robust. To 
date, no other massive clusters of equivalent age are known to host 
similarly significant populations of younger stars’’. 
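
The isochrone labels quoted above are logarithmic ages, log t with t in years, so the linear ages follow from t = 10^(log t). The short sketch below reproduces the rounded ages given in the text; the function name and the dictionary layout are illustrative choices and are not taken from the authors' analysis code.

```python
# Convert isochrone labels log10(t / yr) into linear ages in Myr.
# The log t values are those quoted in the text; all names are illustrative.

def age_myr(log_t: float) -> float:
    """Return the age in Myr for an isochrone labelled by log10(t / yr)."""
    return 10.0 ** log_t / 1.0e6


ISOCHRONES = {
    "NGC 1783 bulk": 9.15,
    "NGC 1783 sequence A": 8.65,
    "NGC 1783 sequence B": 8.95,
    "NGC 1696 bulk": 9.18,
    "NGC 1696 young": 8.70,
    "NGC 411 bulk": 9.14,
    "NGC 411 young": 8.50,
}

for label, log_t in ISOCHRONES.items():
    print(f"{label:22s} log t = {log_t:.2f} -> {age_myr(log_t):7.0f} Myr")
```

The text rounds these values (for example, 447 Myr is quoted as 450 Myr), so agreement is expected only at the level of a few per cent.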

We explored whether ‘blue straggler stars —stars that have been 
rejuvenated through either stellar collisions or mass transfer in binary 
stellar systems''!’—could be entirely responsible for the presence of 
these younger sequences. If they are formed through mass transfer in 
unresolved, compact binary systems, they would appear brighter and 
slightly redder than the corresponding isochrone’ describing zero-age 
single stars (stars whose output luminosities are no longer powered 
by the excess energy gained from gravitational contraction but which 
are instead driven by nuclear fusion of hydrogen atoms). However, the 
younger sequences in both clusters are too blue to account for a binary 
origin and can instead be very well described by single-star isochrones. 
Given their young ages and the timescales involved in evolution 
through binary mass transfer, if any of these younger stars are indeed 
blue stragglers, they will most probably have formed through stellar 
collisions. Collisionally formed blue stragglers, like secondary stellar 
generations originating from colliding stellar winds, are expected to be 
more centrally concentrated than the clusters’ dominant (by number) 
stellar populations®. Figure 2 compares the normalized radial distri- 
butions of the young sequences with those of ‘normal’ cluster stars of
similar luminosity. The stars in the young sequences are markedly less 
centrally concentrated than the dominant older population of cluster 
members. 
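
The comparison shown in Fig. 2 can be sketched as a ratio of cumulative star counts with Poisson counting errors. The code below is an illustration only: the scaling to the outermost radius is an assumed normalization, the error propagation treats the two counts as independent, and the function names are hypothetical rather than those of the authors' pipeline.

```python
import math


def cumulative_ratio(young_radii, reference_radii, radius):
    """Return (ratio, sigma) for N_young(<=R) / N_ref(<=R), or None if undefined."""
    n_young = sum(1 for r in young_radii if r <= radius)
    n_ref = sum(1 for r in reference_radii if r <= radius)
    if n_young == 0 or n_ref == 0:
        return None
    ratio = n_young / n_ref
    # Poisson counting errors (sigma_N = sqrt(N)), combined in quadrature.
    sigma = ratio * math.sqrt(1.0 / n_young + 1.0 / n_ref)
    return ratio, sigma


def normalized_profile(young_radii, reference_radii, radii):
    """Ratio profile scaled so that the outermost radius has value 1 (assumed normalization)."""
    raw = [cumulative_ratio(young_radii, reference_radii, r) for r in radii]
    outer = raw[-1]
    if outer is None:
        raise ValueError("no stars inside the outermost radius")
    scale = outer[0]
    return [None if item is None else (item[0] / scale, item[1] / scale)
            for item in raw]
```

A profile that rises towards 1 only at large radii indicates that the young stars are less centrally concentrated than the reference population (here, the red-giant-branch and red-clump stars).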

The more extended nature of the young populations suggests that 
they may have an external origin. Indeed, since the masses of NGC 
1783, NGC 1696 and NGC 411 are only 1.8 × 10⁵ M⊙, 5.0 × 10⁴ M⊙
and 3.2 × 10⁴ M⊙, respectively (refs 14, 15), they are insufficiently
massive to efficiently capture the stellar winds from asymptotic- 
giant-branch stars'®. Note that our NGC 1696 mass is based on extra- 
polation of the observed stellar luminosity function down to a stellar 
mass of 0.08 M⊙ (the minimum mass for hydrogen fusion, somewhat
depending on the star’s chemical composition), adopting a Kroupa- 
like initial mass distribution'’. In addition, the observed upper limits 
in the mass—age diagrams populated by star clusters in the Magellanic 
Clouds are well-understood in terms of ‘size-of-sample’ effects. They 


1Kavli Institute for Astronomy and Astrophysics and Department of Astronomy, Peking University, Yi He Yuan Lu 5, Hai Dian District, Beijing 100871, China. 2Key Laboratory for Optical Astronomy,
National Astronomical Observatories, Chinese Academy of Sciences, 20A Datun Road, Chaoyang District, Beijing 100012, China. 3Purple Mountain Observatory, Chinese Academy of Sciences,
2 West Beijing Road, Nanjing 210008, China. 4International Space Science Institute - Beijing, 1 Nanertiao, Hai Dian District, Beijing 100190, China. 5Center for Interdisciplinary Exploration and
Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, Illinois 60208, USA. 6Department of Astronomy and
Astrophysics, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, USA.


502 | NATURE | VOL 529 | 28 JANUARY 2016 




[Figure 1: image not reproduced. Colour-magnitude diagrams with axes B (mag) versus B − I (mag).]
Figure 1 | Colour-magnitude diagrams, including the best-fitting 
isochrones, and true-colour images for all three clusters. a, NGC 
1783. Central panel, purple and dark green squares show sequence A 
and sequence B stars, respectively; red solid line, blue solid line and blue 
dashed line show isochrones for log t = 9.15, 8.95 and 8.65, respectively.
The orange box indicates the region where red-clump stars associated with
sequence A may be found. Left panel, representative ±1σ measurement
uncertainties; note that the x-axis scale is different from that of the main
panel. Right panel, true-colour image. b, NGC 1696. Left panel, purple


cannot be reconciled with initial cluster populations containing large 
numbers of young clusters with masses far greater than 10°Mo. 

Adopting a Kroupa-like initial stellar mass distribution’”, we esti- 
mate that the total stellar masses in the younger sequences, down 
to the hydrogen-burning limit, are 372 M⊙ and 250 M⊙ (NGC 1783
sequences A and B, respectively), and 527 M⊙ and 560 M⊙ (NGC 1696 and
NGC 411, respectively). Compared with the total cluster masses, the
young sequences represent mere 0.2%-2.0% mass fractions. The age 
differences between the NGC 1783 stars in sequences A and B, and 
those on the main sequence are 960 Myr and 520 Myr, respectively.
For NGC 1696 and NGC 411, the age differences between the clusters’ 
young and main sequences are 1.02 Gyr and 1.06 Gyr, respectively. 
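
The mass fractions and age offsets in this paragraph follow from simple arithmetic on the quoted numbers. The sketch below assumes the cluster masses are 1.8 × 10⁵, 5.0 × 10⁴ and 3.2 × 10⁴ solar masses and uses the isochrone labels given in the text; because those labels are rounded, the recomputed age gaps should be expected to match the quoted values only to within roughly 10 Myr. The data layout and function names are illustrative.

```python
# Mass fractions of the young sequences and their age offsets from the bulk
# population, using numbers quoted in the text. Names are illustrative.

CLUSTERS = {
    "NGC 1783": {"mass": 1.8e5, "bulk_log_t": 9.15,
                 "young": [("A", 372.0, 8.65), ("B", 250.0, 8.95)]},
    "NGC 1696": {"mass": 5.0e4, "bulk_log_t": 9.18,
                 "young": [("young", 527.0, 8.70)]},
    "NGC 411": {"mass": 3.2e4, "bulk_log_t": 9.14,
                "young": [("young", 560.0, 8.50)]},
}


def mass_fraction_percent(young_mass, cluster_mass):
    """Young-sequence mass as a percentage of the total cluster mass."""
    return 100.0 * young_mass / cluster_mass


def age_gap_myr(bulk_log_t, young_log_t):
    """Age difference in Myr between the bulk and a younger sequence."""
    return (10.0 ** bulk_log_t - 10.0 ** young_log_t) / 1.0e6


for name, data in CLUSTERS.items():
    for label, mass, log_t in data["young"]:
        fraction = mass_fraction_percent(mass, data["mass"])
        gap = age_gap_myr(data["bulk_log_t"], log_t)
        print(f"{name} ({label}): {fraction:.2f}% of cluster mass, "
              f"formed ~{gap:.0f} Myr after the bulk population")
```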

The Large and Small Magellanic Clouds, the host galaxies of our sam- 
ple clusters, contain numerous, densely distributed giant molecular- 
gas clouds’*!”. Massive clusters like those targeted here quench their 
initial star-formation activity on timescales of a few tens of millions of 
years”®”! owing to the occurrence of type II supernovae resulting from 
the deaths of the most massive first-generation stars”””*. This leaves 
a cluster embedded in a gas-poor ‘cavity’. As the young star cluster
moves through the interstellar medium, it could potentially accrete 
sufficient gas to fuel renewed star formation. In theory, secondary 
star formation can be triggered’? and proceed rapidly once the gas 
density reaches the relevant threshold, for sufficiently low tempera- 
tures. This would result in the appearance of a younger ‘simple stellar 
population’. However, to date this idea has not
been confirmed.

Although they are all contained within twice their host clusters’ core 
radii, the observed spatial distributions of all young stellar sequences 
in our sample clusters are more extended than the clusters’ bulk stellar
populations. This suggests that these clusters may have accreted
ambient gas, allowing star formation to proceed. We thus explored 
the expected gas-accretion rate as a function of the local gas den- 
sity. Indeed, it appears possible for NGC 1783-like clusters to accrete 
enough gas to form new stars”* (see Methods). 


squares, younger sequence; red solid and blue dashed lines, isochrones
for log t = 9.18 and 8.70, respectively. Right panel, true-colour image.
c, NGC 411. Left panel, purple squares, younger sequence; red solid and
blue dashed lines, isochrones for log t = 9.14 and 8.50, respectively. Right
panel, true-colour image synthesized from B, V and I band monochromatic
images. The measurement uncertainties in b and c are equivalent to
those shown for a. For the young sequences in NGC 1696 and NGC 411,
enhanced helium abundances have been adopted (see text).


Almost all Galactic globular clusters host multiple stellar popula- 
tions. However, it is still unclear whether these latter populations orig- 
inate from young clusters that formed as single-age stellar populations. 
Our observations of secondary stellar populations in intermediate-age 
Magellanic Cloud clusters suggest that the same process giving rise to 
them may also explain the multiple stellar populations seen in at least 
some Galactic globular clusters. We aim to address this issue in a
follow-up study.

Many star clusters in the Magellanic Clouds contain large numbers 
of stars occupying the colour-luminosity parameter space at bluer 
colours and brighter luminosities than their main-sequence turn-off 
regions”°?7, These clusters include NGC 121, NGC 1652, NGC 1751, 
NGC 1795, NGC 1806, NGC 1846, NGC 1852, NGC 1917, NGC 1978, 
NGC 2121, NGC 2154, NGC 339, NGC 416, NGC 419, Hodge 7, Kron 3, 
Lindsay 1 and Lindsay 38. These brighter and bluer stars are usually 
dismissed as residual field-star contamination. However, if well- 
populated younger sequences are, in fact, embedded in this param- 
eter space, the presence of such simple stellar populations indicates 
that many clusters may have experienced starburst events some time 
after their initial formation epoch. Among the clusters highlighted 
here, some—including NGC 1806, NGC 1846, Lindsay 38 and NGC 
419—appear to exhibit features similar to NGC 1783, NGC 1696 and 
NGC 411, although at a lower level of significance. Particularly for 
Lindsay 38 and NGC 419, the radial distributions of the bright stars 
beyond their main-sequence turn-off regions are known to be less 
concentrated than the bulk cluster stars with similar luminosities”®, 
and hence a population of blue stragglers probably cannot explain 
the properties of all those stars. Some may have originated from gas 
accretion. These clusters will be targeted in our follow-up studies. 
Our discovery of clear, well-populated young sequences in NGC 1783, 
NGC 1696 and NGC 411 has revealed that star clusters may indeed 
have the capacity to accrete gas from their environment. This could, 
in fact, be the most important route to form secondary stellar
populations in young massive clusters.

Figure 2 | Normalized radial distributions of the young sequences with
respect to normal cluster stars of similar luminosity. N(R) represents
the number of stars in the young sequences within radius R, whereas
N_RGB+RC(R) represents the total number of red-giant-branch and
red-clump stars within R. RGB, red-giant branch; RC, red clump. a, NGC 1783;
b, NGC 1696; and c, NGC 411. The error bars reflect Poissonian 1σ
uncertainties.

METHODS

Observational data. NGC 1783. The NGC 1783 observations were obtained as 
part of Hubble Space Telescope (HST) general-observer programme GO-10595 
(principal investigator, P. Goudfrooij), using the Advanced Camera for Surveys/ 
Wide Field Channel (ACS/WFC). The cluster was observed through the F435W 
and F814W filters (with central wavelengths of 435 nm and 814 nm, respectively),
which correspond approximately to the Johnson-Cousins B and I bands,
respectively, and which will be referred to as such henceforth. Short-exposure
images had exposure times of 90 s in the B band and 8 s in the I band, while
long-exposure images had exposure times of 680 s in both bands.

Because NGC 1783 has an extended core (see below), the cluster region occupies 
almost the entire image. Therefore, we obtained an additional set of observations 
towards the southeast of NGC 1783 as a representative field region (HST pro- 
gramme GO-12257; principal investigator, L. Girardi). Its centre is located at a 
distance of more than 300 arcsec from the cluster centre, so that it is unlikely to be 
significantly affected by tidally stripped cluster stars. The data sets pertaining to 
the field were also obtained with the ACS/WFC in the F435W and F814W filters.
Their total exposure times are 700 s and 720 s for the B and I bands, respectively.
These exposure times are sufficient to resolve the young sequences. 

NGC 1696. The NGC 1696 data sets were also obtained as part of HST programme 
GO-10595, using the ACS/WFC. The cluster was observed through the F435W
and F814W filters. The images of NGC 1696 are also composed of short- and
long-exposure frames. The long (short) exposure times in the B and I bands were
680 s (90 s) and 680 s (8 s), respectively. Since NGC 1696 is not as extended as NGC
1783 (see below), a representative field region was selected close to the edge of the 
NGC 1696 images. Few studies to date have targeted NGC 1696. Therefore, we 
derived the physical parameters of NGC 1696 ourselves. 

NGC 411. The data sets of NGC 411 were obtained as part of HST programme 
GO-12257, using the Wide Field Camera-3 (WFC3). The cluster was observed 
through the F475W and F814W filters. The F475W filter is centred at a wavelength 
of 475 nm; its transmission curve also corresponds approximately to that of the 
Johnson B band. The exposure times in both the B and I bands were 700 s. NGC 411
has a relatively small core (see below), which allowed us to select a representative 
field region close to the edge of our science images. 

Photometry and data reduction. We used two independent software packages to 
perform point-spread-function (PSF) photometry, including DAOPHOT” within 
the IRAF environment and DOLPHOT**. Our stellar catalogues are based on 
the DAOPHOT results. We performed the DOLPHOT analysis for comparison, 
to ensure that our final photometry is not biased. 

The DAOPHOT-generated raw stellar catalogue contains a sharpness parameter, 
which describes the goodness of the PSF fit”. For a ‘good’ star, the sharpness should 
be close to 0. We thus constrained our sample to stars with a sharpness between 
−0.5 and 0.5, which removed approximately 4% of objects from our catalogue.
We carefully checked the resulting colour-magnitude diagram and found that 
the main features of interest were not affected by this selection. For NGC 1783 
and NGC 1696, we merged the stellar samples resulting from the short and long 
exposure times, carefully cross-referencing both catalogues to avoid duplication of 
objects in the combined output catalogue. For NGC 1783, the short-exposure-time 
catalogue contributes very little to sequences A and B. For NGC 1696, the short- 
exposure-time stars contribute only marginally to the feature of interest. 
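
The sharpness selection described above amounts to a simple filter on the photometric catalogue. The following is a minimal sketch that assumes each source is a dictionary with a hypothetical `sharp` key and that the bounds of −0.5 and 0.5 are inclusive; neither detail is specified in the text, and this is not the DAOPHOT interface itself.

```python
# Keep only sources whose PSF-fit sharpness is close to zero.
# The catalogue layout (list of dicts with a "sharp" key) is an assumption.

def sharpness_cut(catalogue, limit=0.5):
    """Return the sources with sharpness inside [-limit, +limit]."""
    return [star for star in catalogue if abs(star["sharp"]) <= limit]


def removed_fraction(catalogue, limit=0.5):
    """Fraction of sources rejected by the sharpness cut."""
    if not catalogue:
        return 0.0
    kept = len(sharpness_cut(catalogue, limit))
    return 1.0 - kept / len(catalogue)


example = [
    {"id": 1, "sharp": 0.02},
    {"id": 2, "sharp": -0.31},
    {"id": 3, "sharp": 0.87},   # too extended or blended: rejected
    {"id": 4, "sharp": -1.40},  # too sharp (for example, a cosmic ray): rejected
]
print([star["id"] for star in sharpness_cut(example)])
print(f"removed fraction: {removed_fraction(example):.2f}")
```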
Determination of the cluster and field regions. We divided the stellar spatial 
distribution into 15-20 bins along both the right ascension (α_J2000) and
declination (δ_J2000) axes. We varied the bin numbers to ensure that statistical scatter
would not significantly affect the shape of the number-density distributions. We
used a Gaussian function to fit the latter along both axes and defined the closest
positional coincidence of both Gaussian peaks as the cluster centre. The NGC 1783
cluster centre is located at α_J2000 = 04 h 59 min 08.47 s, δ_J2000 = −65° 59′ 17.81″. For
NGC 1696, the centre coordinates are α_J2000 = 05 h 02 min 11.16 s, δ_J2000 = −67° 59′
07.66″, while for NGC 411, α_J2000 = 01 h 07 min 56.22 s, δ_J2000 = −71° 46′ 04.40″.
Our NGC 1783 and NGC 411 cluster centres are very close to the cluster centres 
determined previously*?*4. We used a Monte Carlo method to examine the spatial 
distribution of the NGC 1696 stars and estimate the areas of annuli at different 
radii. The stellar number density in each annulus is N(R)/A(R), where N(R) is 
the number of stars observed in an annulus with radius R, and A(R) is the area of 
the annulus. We defined the clusters’ (2D-projected) core radii as those radii 
where the density profiles drop to half the respective central densities. We selected 
the areas contained within 2 core radii as the cluster regions. NGC 1783 has a large 
core radius (45-50 arcsec, which is similar to that adopted by ref. 35, although
our core radius is slightly larger than their value of 36.7 arcsec. We compared our 
catalogue with theirs and found that our photometry is deeper and, hence, contains 
more faint stars). The core radius of NGC 1696 is smaller (~30 arcsec), while the 
core radius of NGC 411 is approximately half that of NGC 1783 (20-25 arcsec). 


LETTER 


Hence, the cluster radii we adopted for NGC 1783, NGC 1696 and NGC 411 were 
100 arcsec, 60 arcsec and 50 arcsec, respectively. We selected these radii as cluster 
radii, because of the need to avoid background contamination as much as possible, 
while simultaneously ensuring statistically robust results. In Extended Data Fig. 1, 
we present the stellar number-density profiles. 
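
The density-profile step can be illustrated as follows. This is a sketch under stated assumptions: the image is taken to be a rectangle spanning [0, width] × [0, height], the Monte Carlo estimate returns the area of an annulus that actually falls on the image, and the core radius is located by linear interpolation between annulus midpoints. The function names and these details are illustrative; the text does not specify the authors' implementation.

```python
import math
import random


def annulus_area_mc(r_in, r_out, centre, frame, n_samples=200_000, seed=1):
    """Monte Carlo area of the annulus r_in <= r < r_out that lies on the image."""
    cx, cy = centre
    width, height = frame  # image assumed to span [0, width] x [0, height]
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw uniformly inside the bounding square of the outer circle.
        x = cx + rng.uniform(-r_out, r_out)
        y = cy + rng.uniform(-r_out, r_out)
        r = math.hypot(x - cx, y - cy)
        if r_in <= r < r_out and 0.0 <= x <= width and 0.0 <= y <= height:
            hits += 1
    return hits / n_samples * (2.0 * r_out) ** 2


def density_profile(star_radii, edges, centre, frame):
    """Return N(R)/A(R) for each annulus bounded by consecutive edges."""
    profile = []
    for r_in, r_out in zip(edges[:-1], edges[1:]):
        count = sum(1 for r in star_radii if r_in <= r < r_out)
        area = annulus_area_mc(r_in, r_out, centre, frame)
        profile.append(count / area if area > 0 else float("nan"))
    return profile


def core_radius(mid_radii, densities):
    """Radius where the density first drops to half its innermost value."""
    if densities[0] <= 0:
        return None
    half = 0.5 * densities[0]
    for i in range(1, len(densities)):
        if densities[i] <= half:
            r0, r1 = mid_radii[i - 1], mid_radii[i]
            d0, d1 = densities[i - 1], densities[i]
            # Linear interpolation between the two bracketing annuli.
            return r0 + (half - d0) * (r1 - r0) / (d1 - d0)
    return None
```

With this definition, the cluster region adopted in the text corresponds to twice the value returned by `core_radius`.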

For NGC 1783, we synthesized the field region’s colour-magnitude diagram 
based on a combination of our observations of the cluster’s periphery on the image 
containing the cluster (‘field 1’) and the field region towards the southeast (‘field
2’). Because the features of interest are very bright, we only consider stars with 
B<23 mag. For these stars, NGC 1783 is compact (its core radius is 25 arcsec) and 
field 1 indeed adequately represents the background. We found that the eastern 
part of field 2 exhibits a clear, slightly brighter stellar sequence parallel to the main 
sequence, indicating a significant population of unresolved binary systems. Since 
the field’s stellar number density is low, blending of unrelated stars along the line 
of sight is negligible; instead, this binary population may reflect contamination 
by a nearby star-forming region. We hence selected the western part of field 2
as a representative field region. We selected three rectangles, each covering an 
area of 1,000 pixels x 1,000 pixels (~50 arcsec x 50 arcsec), near the edge of field 
1, as well as four rectangles from the western part of field 2 (covering the same 
area), to construct a complete, combined field region. To assign equal weights to 
the stellar catalogues from both regions, we randomly selected three-quarters of 
the full sample of stars detected in the four rectangles of field 2’s stellar catalogue. 
The combined stellar catalogue based on these seven rectangles represents the syn- 
thesized field region’s colour-magnitude diagram. The cluster region is roughly 1.6 
times larger than the field region (see the left-hand panel of Extended Data Fig. 2). 
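
The equal-weighting step for the two NGC 1783 field regions can be sketched as a random subsample. The example assumes that the stars from the three field-1 rectangles and the four field-2 rectangles are held in two lists, and that keeping three-quarters of the field-2 stars is what balances the two areas; the function name and the fixed seed are illustrative.

```python
import random


def equal_weight_field(field1_stars, field2_stars, n_rect1=3, n_rect2=4, seed=0):
    """Combine two field catalogues after thinning field 2 to the area of field 1."""
    rng = random.Random(seed)
    # Keep a fraction n_rect1 / n_rect2 (three-quarters here) of the field-2 stars.
    n_keep = round(len(field2_stars) * n_rect1 / n_rect2)
    return list(field1_stars) + rng.sample(list(field2_stars), n_keep)


field1 = [("f1", i) for i in range(10)]
field2 = [("f2", i) for i in range(20)]
combined = equal_weight_field(field1, field2)
print(len(combined))
```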

For NGC 1696, we selected an area of 600 pixels x 4,000 pixels 
(~30 arcsec x 200 arcsec) near the edge of the image as a representative field region. 
The NGC 1696 cluster region is roughly 1.7 times larger than the field region. The 
selected cluster and field regions pertaining to NGC 1696 are shown in the middle 
panel of Extended Data Fig. 2. 

For NGC 411, we selected a field region covering an area of 800 pixels x 3,500 
pixels (~32 arcsec x 140 arcsec) from the cluster’s periphery. We avoided regions 
that were located close to the edge of the image, where the photometric quality is 
markedly inferior, probably owing to the relatively large offsets between the indi- 
vidual science images, combined with instrumental propagation effects. Although 
our selection may include some tidally stripped cluster stars, this does not affect the 
magnitude range of interest, which is bright. The NGC 411 cluster region is roughly 
1.8 times larger than the field region. The selected cluster and field regions of NGC 
411 are shown in the right-hand panel of Extended Data Fig. 2. 

Reducing background contamination and isochrone fitting. Once we had 
obtained the cluster and background colour-magnitude diagrams, we generated 
a common magnitude x colour grid with cell sizes of 0.5 mag x 0.25 mag, spanning 
the ranges from (B— I) = —2.5 mag to 3.5 mag and from B= 16 mag to 27 mag. 
This range is sufficiently large to cover the full colour-magnitude diagrams of our 
target clusters. The cell size is relatively large for the main sequences, because it 
was specifically designed to be practically useful in the regions occupied by the 
young, blue sequences, where the stellar number density is lower. We counted 
the number of background stars in each cell and calculated the number of possible 
contaminating stars in the same cell (corrected for differences in areas covered), 
which we then randomly removed. We confirmed that varying the cell sizes from 
0.3 mag x 0.15 mag to 0.5 mag x 0.25 mag for NGC 1783 would not affect the
significance of any of the features of interest. For NGC 1696, the typical practically
useful cell sizes range from 0.4 mag x 0.4 mag to 0.5 mag x 0.5 mag, while for NGC
411, viable cell sizes range from roughly 0.3 mag x 0.15 mag to 0.5 mag x 0.25 mag.
Adopting much larger or smaller cell sizes would erase the observed sequences. We 
carefully examined the performance of our decontamination method and found 
that the observed features do not depend on the cluster or field regions selected: see 
Extended Data Figs 3 and 4, where we use NGC 1783 as benchmark. In Extended 
Data Fig. 3, we show the field-decontaminated colour-magnitude diagrams per- 
taining to three different samples of cluster members, at radii R > 30″, R > 60″
and R > 90″. They all exhibit distinct younger sequences. In Extended Data Fig. 4,
we select representative field regions from different images (two from a separate 
image and one from the image which also contains the cluster itself). The observed 
young sequences remain clearly visible for all three background regions adopted. 
In Extended Data Fig. 5, we show the decontaminated colour-magnitude diagrams 
resulting from adoption of different grid sizes. This figure shows that the observed 
features are almost independent of grid size. Extensive tests also showed that the 
sequences found in the NGC 1696 and NGC 411 colour-magnitude diagrams
are similarly well-defined. This confirms that the observed sequences are phys- 
ically real rather than caused by statistical sampling effects. We also found that, 
for the adopted cell sizes, the reduced colour-magnitude distributions are similar 
in appearance to the real background colour-magnitude diagrams. Our adopted
method therefore performs adequately. From Extended Data Fig. 6a, e and i, one
can deduce that these bright sequences are indeed already embedded in the unre- 
duced colour-magnitude diagrams. 
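
The statistical field-star subtraction described in this section can be sketched as follows. This is an illustration under assumptions, not the authors' code: stars are (colour, magnitude) tuples, the default grid uses cells of 0.25 mag in colour by 0.5 mag in magnitude starting at (B − I) = −2.5 mag and B = 16 mag, the expected number of contaminants per cell is rounded to the nearest integer, and `area_ratio` is the cluster-region area divided by the field-region area.

```python
import math
import random
from collections import defaultdict


def cell_index(colour, mag, colour_min=-2.5, mag_min=16.0,
               d_colour=0.25, d_mag=0.5):
    """Return the (colour, magnitude) grid cell containing a star."""
    return (math.floor((colour - colour_min) / d_colour),
            math.floor((mag - mag_min) / d_mag))


def decontaminate(cluster, field, area_ratio, seed=0, **grid):
    """Randomly remove the expected number of field stars from each cell.

    cluster, field: lists of (colour, magnitude) tuples.
    area_ratio: cluster-region area divided by field-region area.
    """
    rng = random.Random(seed)
    field_counts = defaultdict(int)
    for colour, mag in field:
        field_counts[cell_index(colour, mag, **grid)] += 1

    members = defaultdict(list)
    for i, (colour, mag) in enumerate(cluster):
        members[cell_index(colour, mag, **grid)].append(i)

    removed = set()
    for cell, indices in members.items():
        # Scale the field count to the cluster area, capped at the cell content.
        n_remove = min(len(indices),
                       round(field_counts.get(cell, 0) * area_ratio))
        removed.update(rng.sample(indices, n_remove))
    return [star for i, star in enumerate(cluster) if i not in removed]
```

Repeating the call with different `d_colour` and `d_mag` values mirrors the test of cell-size dependence reported in the text.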

We obtained best fits to all observed sequences, including the clusters’ main 
sequences, based on matching the observations with theoretical stellar isochrones’. 
Similarly to other intermediate-age star clusters, NGC 1783, NGC 1696 and NGC 
411 display extended main-sequence turn-off regions”®°, which renders deter- 
mination of their main-sequence ages difficult. However, it has been reported that 
ages of such intermediate-age star clusters can be constrained by consideration 
of their tight subgiant branches*”** (but see ref. 39 for an opposing view), which 
represent the stellar evolutionary stage stars enter once they have exhausted the 
hydrogen in their cores through nuclear fusion. Indeed, all of our sample clusters 
exhibit tight subgiant branches. The final age determination yields log t = 9.15, 9.18
and 9.14 for NGC 1783, NGC 1696 and NGC 411, respectively. 

We next determined the ages of each of the blue sequences. For NGC 1783, 
sequences A and B are characterized by ages of log t = 8.65 and 8.95, respectively.
The NGC 1783 main-sequence stars, as well as those in sequences A and B,
share the same metal abundance, Z = 0.008 (40% of solar metallicity), and visual
extinction, A_V = 0.06 mag. We adopted a true distance modulus for NGC 1783
of (m − M)_0 = 18.46 mag (corresponding to a distance of 49.2 kpc). The young
sequence in NGC 1696 is adequately characterized by a log t = 8.70 isochrone with
the same metallicity as the cluster’s main sequence, Z = 0.004 (20% solar). However,
the young sequence appears 0.06 mag bluer than the zero-age main sequence of the 
cluster’s bulk stellar population. An enhanced helium abundance (Y = 0.330) can 
explain the colour offset", where Y= 0.256 for the cluster’s main sequence. The 
adopted extinction and distance modulus for NGC 1696 are Ay = 0.10 mag and 
(m— M)o= 18.50 mag (50.1 kpc), respectively. The young sequence in NGC 411 
has an age of logt= 8.50. It has the same metallicity as the cluster’s main sequence, 
Z=0.002. Its young sequence is also very blue, which can again only be explained 
if it is characterized by an enhanced helium abundance of Y=0.400 compared with 
Y=0.252 for the cluster’s main sequence. Foreground extinction of Ay = 0.25 mag 
is appropriate and our adopted true distance modulus is (m — M)o = 18.90 mag 
(60.3 kpc). 
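
The quoted distances follow from the adopted true distance moduli through the standard relation d = 10^((m − M)₀/5 + 1) pc. A minimal sketch of that conversion is given here as a consistency check; the function and dictionary names are illustrative and not part of the original analysis.

```python
def distance_kpc(true_distance_modulus: float) -> float:
    """Distance in kpc from an extinction-corrected distance modulus (m - M)_0."""
    distance_pc = 10.0 ** (true_distance_modulus / 5.0 + 1.0)
    return distance_pc / 1000.0


# Adopted true distance moduli quoted in the text (mag).
ADOPTED_MODULI = {"NGC 1783": 18.46, "NGC 1696": 18.50, "NGC 411": 18.90}

for cluster, modulus in ADOPTED_MODULI.items():
    print(f"{cluster}: (m - M)_0 = {modulus:.2f} mag -> {distance_kpc(modulus):.1f} kpc")
```

To one decimal place this reproduces the 49.2, 50.1 and 60.3 kpc given above.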

Blue straggler stars as possible origins of the younger sequences. One possible
explanation for these bright sequences is that they are composed of blue straggler
stars. Therefore, we investigated their radial concentration relative to
stars that have similar luminosities. In NGC 1783, NGC 1696 and NGC 411, the
latter stars are mostly red-giant-branch and red-clump stars: see Extended Data Fig. 7.

However, as shown in Fig. 2, the stars defining the bright sequences are all less
centrally concentrated than red-giant stars with similar luminosities. If they were
genuine blue stragglers, irrespective of their origin, they would be expected to be more
centrally concentrated than similar-luminosity red-giant stars, because blue
stragglers are expected to be more massive than red giants. Dynamical interactions are
also unlikely to have redistributed all blue stragglers to the outskirts of our sample
clusters, since the typical dynamical timescales⁴¹ are much longer than the clusters’
current ages. In addition, the flat cluster cores observed for NGC 1783, NGC 1696
and NGC 411 make it very unlikely that core-collapse events have occurred.
Core collapse would produce a cuspy radial density profile⁴², and such
a process is a prerequisite for the presence of a coeval collisional blue straggler
population. The more extended radial distributions of the stars in the younger
sequences compared with the dominant cluster populations argue against stellar
collisions in the cluster cores having played an important role.

Gas accretion from the interstellar medium. We followed the method of ref. 23 to
estimate the regions of parameter space where an NGC 1783-like star cluster with
a mass of 1.8 × 10⁵ M⊙ and a half-mass radius of 11.4 pc (ref. 43) could accrete the
required mass to form two additional generations of stars: one containing 250 M⊙
in stars some 520 Myr after the initial star-formation event, and a subsequent
generation composed of 370 M⊙ of stars 440 Myr later. The key equations in ref. 23
are equations (3), (5) and (8). For all calculations we assumed a star-formation
efficiency of 10% (which implies that only 10% of the available gas is converted
into stars). We explored two processes through which a cluster can accrete gas from
the interstellar medium²³: one due to gravity (‘Bondi accretion’), and one due to
sweeping up material as the cluster orbits around its host galaxy’s centre. The latter
process involves some initial intracluster gas which interacts collisionally with the
interstellar gas. The accretion rates from both of these processes depend on both
the gas density, n, and the relative velocity of cluster and gas, V. Ram-pressure
stripping can limit the accretion, and this process also depends on n and V.
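
The delays between star-formation events and the gas budget implied by the 10% efficiency can be checked with simple arithmetic. The sketch below assumes that the delays are differences between the isochrone ages quoted in this section (log t = 9.15, 8.95 and 8.65 for the NGC 1783 main sequence, sequence B and sequence A) and that the required gas mass is the stellar mass divided by the efficiency; all names are illustrative.

```python
STAR_FORMATION_EFFICIENCY = 0.10  # fraction of accreted gas converted into stars


def age_myr(log_t: float) -> float:
    """Convert log10(age / yr) into an age in Myr."""
    return 10.0 ** log_t / 1.0e6


def required_gas_mass(stellar_mass_msun: float,
                      efficiency: float = STAR_FORMATION_EFFICIENCY) -> float:
    """Gas mass (solar masses) needed to form the given stellar mass."""
    return stellar_mass_msun / efficiency


main_sequence_age = age_myr(9.15)
sequence_b_age = age_myr(8.95)
sequence_a_age = age_myr(8.65)

delay_b_myr = main_sequence_age - sequence_b_age  # initial burst -> sequence B
delay_a_myr = sequence_b_age - sequence_a_age     # sequence B -> sequence A

for name, stars, delay in (("B", 250.0, delay_b_myr), ("A", 370.0, delay_a_myr)):
    gas = required_gas_mass(stars)
    print(f"sequence {name}: delay ~{delay:.0f} Myr, gas needed {gas:.0f} Msun, "
          f"mean accretion rate ~{gas / delay:.1f} Msun/Myr")
```

Under these assumptions the delays come out close to the 520 Myr and 440 Myr quoted above, and the cluster must accrete 2,500 and 3,700 solar masses of gas, respectively.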

Extended Data Fig. 8 shows the allowed (shaded) regions of parameter space in
(n, V) that would enable a cluster to accrete the desired amount of gas in the period
between star-formation events. This figure assumes accretion from a volume-filling
interstellar medium over the entire duration between respective generations (Bondi
accretion). The Bondi line is curved, because this accretion rate also depends on
the gas sound speed, which we have assumed to be 10 km s⁻¹ (that is, gas at a
temperature of ~10⁴ K). Anything to the upper right of the ‘Ram’ line will strip
the gas from the cluster. Below that line, and for a given density, the Bondi line
defines the upper limit to the cluster’s bulk velocity that would be allowed for the
cluster to accrete this amount of gas. In other words, at a velocity below the curve,
the cluster would accrete gas more quickly, and, for instance, if the star-formation
efficiency were less than 10%, it could still form the same mass of stars over the
same period. Conversely, at a given density, sweeping up of gas is more efficient at
larger velocities, so the dashed line shows the lower limit to the velocity that would
be required to reach the desired amount of gas.
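
The link between a sound speed of about 10 km s⁻¹ and a temperature of about 10⁴ K can be illustrated with the isothermal sound speed, c_s = sqrt(k_B T / (μ m_H)). This is a rough sketch only: the isothermal form and the mean molecular weights used (μ = 1.0 for neutral atomic hydrogen, μ = 0.6 for ionized gas) are assumptions made here, not values taken from the text.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant
HYDROGEN_MASS_KG = 1.6735575e-27   # mass of a hydrogen atom


def isothermal_sound_speed_km_s(temperature_k: float,
                                mean_molecular_weight: float = 1.0) -> float:
    """Isothermal sound speed sqrt(k_B T / (mu m_H)) in km/s."""
    speed_m_s = math.sqrt(
        BOLTZMANN_J_PER_K * temperature_k
        / (mean_molecular_weight * HYDROGEN_MASS_KG)
    )
    return speed_m_s / 1000.0


for mu in (1.0, 0.6):  # assumed mean molecular weights
    print(f"T = 1e4 K, mu = {mu}: c_s ~ {isothermal_sound_speed_km_s(1.0e4, mu):.1f} km/s")
```

Both assumed compositions give values within roughly 20% of the adopted 10 km s⁻¹.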

Extended Data Fig. 9 shows two examples of how the accreted mass could
accumulate over time for the Bondi regime (solid lines) and the sweeping regime
(dashed lines). For the Bondi regime, we used a relative velocity of 4 km s⁻¹ and a
density of 0.3 cm⁻³. For the sweeping regime, we used a velocity of 50 km s⁻¹
(which is similar to the rotation velocity at the position of NGC 1783 with respect
to the Large Magellanic Cloud’s centre⁴⁴) and a density of 0.05 cm⁻³. The data
points show the required stellar masses, with 180 M⊙ uncertainties, divided by the
star-formation efficiency of 10%.

Sample size. No statistical methods were used to predetermine sample size.

Extended Data Figure 1 | Stellar number-density profiles. Here ρ is the number density of stars at a given radius R. a, NGC 1783. The vertical solid line
indicates the cluster’s core radius, that is, the position where the number density decreases to half the central value. The ±1σ uncertainties shown are due
to Poisson noise. b, c, As a, but for NGC 1696 (b) and NGC 411 (c).

Extended Data Figure 2 | Spatial distributions of cluster and field stars. Red and blue points represent cluster stars and the adopted field stars,
respectively. The black dots are all observed stars except for the cluster and field stars. RA, right ascension; dec., declination. a, NGC 1783; b, NGC 1696;
and c, NGC 411.

Extended Data Figure 3 | Field-star-decontaminated colour-magnitude diagrams for samples of stars at different radii in NGC 1783. a, R > 30";
b, R > 60"; and c, R > 90".

Extended Data Figure 4 | Field-star-decontaminated colour-magnitude diagrams of NGC 1783 for three different adopted reference fields.
a, b, Resulting colour-magnitude diagram (a) based on the field-star sample drawn from the image containing the cluster (b). c, d, As a and b,
but for a representative field region taken from a separate image. e, f, As c and d, but for a different field region taken from the same, separate image.
The black points are stars which are located in the cluster region, whereas the red points are the adopted field stars.

Extended Data Figure 5 | Field-star-decontaminated colour-magnitude diagrams of NGC 1783 for three different grid sizes. a, 0.30 mag × 0.15 mag;
b, 0.40 mag × 0.20 mag; and c, 0.50 mag × 0.25 mag.

Extended Data Figure 6 | Colour-magnitude analysis. a–d, NGC 1783; e–h, NGC 1696; i–l, NGC 411. First column (a, e, i), raw colour-magnitude
diagrams; second column (b, f, j), field-decontaminated colour-magnitude diagrams (also shown in Fig. 1); third column (c, g, k),
colour-magnitude diagrams of the representative field regions; and fourth column (d, h, l), stellar colour-magnitude distributions that were removed
from the raw catalogues.

Extended Data Figure 7 | Colour-magnitude diagrams highlighting
specific features. a, Purple and dark green squares, stars in NGC 1783
sequences A and B, respectively. Dark green circles, corresponding
red-giant-branch and red-clump stars, used for comparison with sequence B.
The combination of dark green and purple circles represents the sample
used for comparison with sequence A. b, Purple squares, NGC 1696
young-sequence stars; red circles, corresponding red-giant-branch and
red-clump stars used for comparison. c, As b, but for NGC 411.

Extended Data Figure 8 | Gas-accretion diagnostic diagram. V is the
relative velocity of the cluster with respect to the gas, and n represents
the gas density. The shaded regions indicate the parameter space where
an NGC 1783-like cluster can accrete the required mass to form the two
additional generations of stars, namely one of 250 M⊙ over 520 Myr (blue,
corresponding to sequence B in Fig. 1), and a second of 370 M⊙ over
440 Myr (green, corresponding to sequence A in Fig. 1). We have assumed
a star-formation efficiency of 10% for all calculations in this figure. The
regions to the right of the ‘Gravity’ curves correspond to where Bondi
accretion can accumulate at least the required mass, and the regions to the
right of the ‘Sweep’ curves correspond to where accretion by collisional
sweeping up of ambient interstellar gas by seed intracluster gas can
accumulate at least the required mass. The parameter space above the
‘Ram’ curves is excluded because ram pressure strips clusters of their gas
in those regions. See Methods for details.

Extended Data Figure 9 | Gas mass, M, accreted from the interstellar
medium as a function of time, t. We have adopted a star-formation
efficiency of 10% and calculated representative interstellar gas-accretion
frameworks that can explain the stellar masses in the secondary sequences
A and B in NGC 1783. For the Bondi regime (solid lines), we used a
relative velocity of 4 km s⁻¹ and a density of 0.3 cm⁻³. For the sweeping
regime (dashed lines), we used a velocity of 50 km s⁻¹ and a density of
0.05 cm⁻³. The blue and green filled regions indicate the stellar masses
and age offsets of NGC 1783 sequences A and B, respectively. The vertical
extent of each region provides an estimate of the range in allowed masses.
Specifically, for each sequence we plot a region centred on the mass
derived using the Kroupa initial mass function, as given in the main
text; the range to higher and lower masses is equal to the mean of the
differences between the total masses derived using a Salpeter and Kroupa
initial mass function of 180 M⊙ (multiplied by the assumed 10% star
formation efficiency).

Measurement noise 100 times lower than the
quantum-projection limit using entangled atoms

Onur Hosten¹, Nils J. Engelsen¹, Rajiv Krishnakumar¹ & Mark A. Kasevich¹


Quantum metrology uses quantum entanglement (correlations in
the properties of microscopic systems) to improve the statistical
precision of physical measurements¹. When measuring a signal, such
as the phase shift of a light beam or an atomic state, a prominent
limitation to achievable precision arises from the noise associated
with the counting of uncorrelated probe particles. This noise,
commonly referred to as shot noise or projection noise, gives rise
to the standard quantum limit (SQL) to phase resolution. However,
it can be mitigated down to the fundamental Heisenberg limit by
entangling the probe particles. Despite considerable experimental
progress in a variety of physical systems, a question that persists is
whether these methods can achieve performance levels that compare
favourably with optimized conventional (non-entangled) systems.
Here we demonstrate an approach that achieves unprecedented
levels of metrological improvement using half a million ⁸⁷Rb atoms
in their ‘clock’ states. The ensemble is 20.1 ± 0.3 decibels (100-fold)
spin-squeezed via an optical-cavity-based measurement. We directly
resolve small microwave-induced rotations 18.5 ± 0.3 decibels
(70-fold) beyond the SQL. The single-shot phase resolution of
147 microradians achieved by the apparatus is better than that
achieved by the best engineered cold atom sensors despite lower
atom numbers²,³. We infer entanglement of more than 680 ± 35
particles in the atomic ensemble. Applications include atomic
clocks⁴, inertial sensors⁵, and fundamental physics experiments
such as tests of general relativity⁶ or searches for the electron electric
dipole moment⁷. To this end, we demonstrate an atomic clock
measurement with a quantum enhancement of 10.5 ± 0.3 decibels
(11-fold), limited by the phase noise of our microwave source.

Quantum noise arises from the impossibility of simultaneously
measuring conjugate physical observables. For example, for a
harmonic oscillator, capturing the basic physics of light or superconducting
circuits, the conjugate observables would be position and
momentum. These observables exhibit finite uncertainties even in the
lowest possible energy state. For an ensemble of N two-level atoms,
a convenient mapping that captures the physics is the pseudo-spin
system, where each atom is a spin-half system and the ensemble
constitutes a spin-J system (J = N/2). The conjugate observables
are the Cartesian spin components $J_{x,y,z}$, with the uncertainty relation
$\Delta J_z\,\Delta J_y \geq |\langle J_x\rangle/2|$, and can be pictorially represented on a Bloch
sphere (see below). For an unentangled ensemble in a coherent spin
state (CSS) with $\langle J_y\rangle = \langle J_z\rangle = 0$ the uncertainties are given by
$\Delta J_z = \Delta J_y = \sqrt{N}/2$. This quantity is the projection noise of a CSS
(CSS noise).
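
The √N/2 projection noise can be illustrated with a small Monte Carlo sketch. It assumes that a projective J_z measurement on a CSS lying on the equator of the Bloch sphere finds each atom independently in either clock state with probability 1/2, so that J_z = N_up − N/2 is binomially distributed; the atom number, sample count and seed are arbitrary illustrative choices.

```python
import math
import random
import statistics


def sample_jz(n_atoms: int, rng: random.Random) -> float:
    """One simulated J_z outcome for an equatorial coherent spin state.

    Each atom is found 'up' with probability 1/2, so the number of 'up'
    atoms is the number of set bits in n_atoms random bits.
    """
    n_up = bin(rng.getrandbits(n_atoms)).count("1")
    return n_up - n_atoms / 2


def css_noise(n_atoms: int) -> float:
    """Projection noise of a coherent spin state, sqrt(N)/2."""
    return math.sqrt(n_atoms) / 2


rng = random.Random(1)
N_ATOMS = 100_000
samples = [sample_jz(N_ATOMS, rng) for _ in range(2000)]
measured_noise = statistics.pstdev(samples)

print(f"simulated r.m.s. J_z: {measured_noise:.1f}")
print(f"sqrt(N)/2:            {css_noise(N_ATOMS):.1f}")
```

With 2,000 simulated runs the sample standard deviation should agree with √N/2 to within a few per cent.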

‘Squeezing’ serves to redistribute the noise to make the observable
of interest quieter than the CSS noise level, while still conforming to
uncertainty relations. This process is non-classical as it introduces
quantum entanglement into the system. The metrological improvement
provided by squeezing is quantified by

$$\chi^2 = \left(\frac{\sqrt{N}/2}{\Delta J_z}\right)^2 \left(\frac{|\langle J_x\rangle|}{N/2}\right)^2 ;$$

the first factor represents noise reduction, whereas the second represents
coherence loss. This quantity translates directly into reduction
in resources needed to perform a specific measurement (20 dB
($\chi^2 = 100$) is equivalent to a 100-fold increase in atom number or
reduction in averaging time).
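
The decibel figures quoted in this Letter are power ratios, 10 log₁₀ of the improvement. The following sketch converts between the two and evaluates the improvement as the product of a noise-reduction factor, (CSS noise / ΔJ_z)², and a coherence factor, (|⟨J_x⟩| / (N/2))², as described above. The function names and the example state are illustrative assumptions.

```python
import math


def db_to_factor(decibels: float) -> float:
    """Convert a decibel value into a multiplicative (power) factor."""
    return 10.0 ** (decibels / 10.0)


def factor_to_db(factor: float) -> float:
    """Convert a multiplicative (power) factor into decibels."""
    return 10.0 * math.log10(factor)


def metrological_improvement(n_atoms: float, delta_jz: float, mean_jx: float) -> float:
    """Noise-reduction factor times coherence factor."""
    noise_reduction = (math.sqrt(n_atoms) / 2.0 / delta_jz) ** 2
    coherence = (abs(mean_jx) / (n_atoms / 2.0)) ** 2
    return noise_reduction * coherence


# Hypothetical state: J_z noise ten times below CSS noise, full coherence.
n = 5.0e5
example = metrological_improvement(n, math.sqrt(n) / 20.0, n / 2.0)
print(f"example state: {example:.0f}-fold, {factor_to_db(example):.1f} dB")

for quoted_db in (20.1, 18.5, 10.5):
    print(f"{quoted_db} dB -> {db_to_factor(quoted_db):.1f}-fold")
```

The three quoted values correspond to roughly 100-fold, 70-fold and 11-fold improvements, matching the figures given in the abstract.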

In optical settings¹⁰ and superconducting microwave
circuits¹¹, squeezing in excess of 12 dB has been demonstrated.
Implementations in the interferometers of the GEO-600 and LIGO
gravitational wave detectors achieved 2.5 dB improvements¹².
Other demonstrations include non-invasive biological imaging¹³.
Spin-squeezing has been shown in cold atomic ensembles with
methods based on both interaction¹⁴⁻¹⁶ and measurement¹⁷⁻¹⁹.
States squeezed by 5.6 dB (ref. 20) were used to obtain atomic-clock
improvements up to 4.5 dB (ref. 21), and magnetometer
enhancements up to 3.4 dB were observed in other experiments²²,²³. A
metrological improvement of 10 dB was attained in a cavity-based
experiment taking advantage of cycling transitions in ⁸⁷Rb.
However, as it utilizes magnetically sensitive states, and atoms are
non-uniformly coupled to the cavity, this last approach is not suitable
for atomic clocks and other precision sensors requiring the release of
trapped atoms from their locations in the cavity.

Figure 1 | Overall setup. a, Uniform atom-probe coupling: atoms are
trapped at the maxima of the probe intensity profile by the 1,560 nm lattice
(left). The 780 nm probe light is detuned by equal and opposite amounts
from the two clock states (right). b, Probe light is generated by
frequency-doubling the 1,560 nm light whose frequency is stabilized to match that
of the main cavity resonance via feedback control. No further frequency
stabilization of the probe is required, eliminating residual 780 nm light
inside the cavity. AOM, acousto-optic modulator; PBS, polarizing
beam-splitter. c, Homodyne detection system and the form of the output signal.
Path ‘A’ contains two path length stabilizing side-bands, in addition to the
probe frequency, also present on path ‘B’. See Methods for details of a–c.

Owing to systematic errors arising from collisions between atoms,
there is typically an upper bound to the number of atoms that can be
employed in state-of-the-art cold atom sensors²,³. Squeezing offers
a universal path to surpassing this limitation in sensitivity. However,
methods demonstrated thus far have fallen significantly short of
achieving competitive single-shot phase readout sensitivities.

Here we present a quantum metrology implementation using the
magnetically insensitive clock states ($|\uparrow\rangle$ and $|\downarrow\rangle$) of ⁸⁷Rb (refs 19–21).
We prepare the squeezed states through a collective population
difference measurement on the atoms. In the spin language, we make a $J_z$
measurement that projects the quantum state into one with a narrower
distribution of $J_z$ than that of a CSS. The measurement is enabled by a
high-finesse optical cavity: the $|\downarrow\rangle$ ($|\uparrow\rangle$) atoms increase (decrease) the
index of refraction seen by the probe light. Therefore, the frequency
shift of the cavity resonance is a direct predictor of $J_z$.

The $J_z$ measurement resolution is determined by the competition
between photon shot noise and probe induced Raman scattering
(spin-flips). The former limits the precision of the cavity frequency
measurements; the latter leads to a random walk in the measured observable.
For the ⁸⁷Rb clock states we expect an optimal enhancement
of Xa NCI faa? aula (see Methods and ref. 25). Here, C is the
cooperativity (photon-scattering-rate ratio into cavity mode versus
free-space), $\Gamma/\omega_{\mathrm{HF}}$ is the ratio between the excited state linewidth and the
hyperfine splitting, and $\varepsilon$ is the overall probe detection efficiency. The
enhancement saturates with atom number owing to the impact of atomic
absorption on measurement sensitivity, in addition to the spin-flips. For
our parameters ($\varepsilon = 0.16$), the upper bound on achievable squeezing is
about 24 dB.

We operate in a configuration where the probe light is uniformly
coupled to all the atoms that are confined in a one-dimensional optical
lattice (Fig. 1a). As we will elaborate, this uniform coupling will enable
retrieval of squeezing even if the atoms are released from the optical
lattice, a key requirement for many cold atom sensors. In contrast to
earlier work, we prepare squeezed states without resorting to
spin-echo techniques required for non-uniformly coupled systems,
and we base squeezing levels on the true CSS noise level of N atoms
(instead of an inferred level using lower effective numbers of atoms).

Figure 2 | Squeezing and metrology. a, Projection noise of unentangled
spins. Top: pictorial representation of a CSS on the Bloch sphere (left)
and a sample $J_z$ distribution (right). Bottom: characterization of CSS
noise. Dot-dashed curve, the expected noise; solid line, fit revealing the
underlying microwave rotation noise (dot-dashed straight line); dashed
line, resolution of the π-strength measurement (subtracted in quadrature
from the data). Error bars, 68% statistical confidence interval. b, Top:
pictorial representation of two squeezed spin states, one rotated in the
direction of the white arrow by a weak microwave pulse by 660 µrad.
Bottom: the corresponding measured squeezed distributions overlaid
on the un-squeezed distribution (measurement strengths, 0.75π–2.0π).
c, Top: 96.2%-contrast Rabi oscillations executed by the squeezed state,
observed with fluorescence imaging. The Bloch sphere represents the state
at the indicated point. Bottom: measured $J_z$ distributions at the lowest
point of the Rabi oscillations. The width of the un-squeezed case, shown
for reference, is limited by camera read-out noise.

The core apparatus is described in ref. 26. We load the 520-µK-deep
optical lattice with a 25 µK ensemble of up to 7 × 10⁵ atoms prepared
in the lower clock state (see Methods). Our 10.7-cm near-confocal
cavity with a finesse of 1.75 × 10⁵ (linewidth $\kappa_0$ = 8.0 kHz) yields
C = 0.78. The cavity resonance frequency is measured with a
homodyne detection system (Fig. 1b, c) that is limited by photon shot noise
down to 10 Hz noise frequencies, well below the frequency range of
interest (~0.2–4 kHz). The form of the signal is shown in Fig. 1c. The
probe laser frequency is stable down to cavity shifts induced by three
spin-flips. For reference, a spin-flip causes a 5.5 Hz shift, a value
calculated from the well-known atom-cavity parameters, and verified
experimentally at the 10% level with ac-Stark shift measurements for
a known intra-cavity probe power.
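
The quoted cavity figures can be cross-checked with a short sketch. It assumes a two-mirror standing-wave cavity, so that the free spectral range is c/(2L) and the finesse is the free spectral range divided by the linewidth, and it converts spin-flip counts into cavity shifts using the 5.5 Hz per spin-flip given above. The function names are illustrative.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
SHIFT_PER_SPIN_FLIP_HZ = 5.5  # cavity shift per spin-flip quoted in the text


def free_spectral_range_hz(cavity_length_m: float) -> float:
    """Free spectral range c / (2 L) of a two-mirror standing-wave cavity."""
    return SPEED_OF_LIGHT_M_S / (2.0 * cavity_length_m)


def finesse(cavity_length_m: float, linewidth_hz: float) -> float:
    """Finesse as free spectral range divided by linewidth."""
    return free_spectral_range_hz(cavity_length_m) / linewidth_hz


def css_noise_shift_hz(n_atoms: float) -> float:
    """Cavity shift corresponding to CSS noise, with sqrt(N)/2 counted in spin-flips."""
    return math.sqrt(n_atoms) / 2.0 * SHIFT_PER_SPIN_FLIP_HZ


print(f"finesse for L = 10.7 cm, linewidth 8.0 kHz: {finesse(0.107, 8.0e3):.3g}")
print(f"three spin-flips: {3 * SHIFT_PER_SPIN_FLIP_HZ:.1f} Hz")
print(f"CSS noise at 5e5 atoms: {css_noise_shift_hz(5.0e5) / 1e3:.2f} kHz")
```

Under these assumptions the finesse comes out at about 1.75 × 10⁵, and the CSS noise at 5 × 10⁵ atoms corresponds to a shift of roughly 2 kHz, a sizeable fraction of the 8.0 kHz linewidth.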

We first calibrate our CSS noise level. With a π/2 microwave pulse we
bring the state to the equator on the Bloch sphere. To suppress microwave
pulse amplitude noise, the implementation is in two steps: 1/29 — T2773 
(subscripts indicate relative phase between pulses). Subsequently, we
probe the cavity with a 200 µs probe pulse (see Methods). We obtain a
distribution of cavity shifts revealing the $J_z$ distribution (Fig. 2a). Since
we measure a balanced population difference, atom number fluctuations
(~2% r.m.s.) between different runs do not enter at a measurable level.
Microwave rotation noise becomes noticeable towards 5 × 10⁵ atoms.

Figure 3 | Measured spin-noise reduction and coherence. a, Left:
observed spin-noise (r.m.s.) for the difference between two
equal-strength back-to-back measurements separated by 1.1 ms (half transverse
oscillation period). No-atom data signifies the noise floor. Solid lines,
model fits (see Methods). Right: Mean length J of the Bloch vector
(coherence), obtained by measuring Ramsey fringe contrasts following
the first measurement. Loss in coherence is due to lattice induced ac-Stark
shifts which are partially cancelled by probe ac-Stark shifts. Solid line,
Gaussian decay fit. b, Atom number dependence of maximum observed
spin-noise reduction with respect to CSS noise (0 dB is no reduction).
Solid line, model fit (see Methods). Error bars and shaded regions in all
panels, 68% statistical confidence interval for data and fits respectively.

Next, we show that the $J_z$ measurements indeed prepare a state with
reduced $J_z$ noise. Irrespective of the noise on the first measurement, a
second measurement should be correlated (to within the measurement
strength) with the first one. We quantify the measurement strength
by the amount of differential phase shift (in radians) accumulated on
the clock states due to probe-induced ac-Stark shifts. At high atom
numbers, CSS noise exceeds the linear region of the homodyne
signal. Outside this region both the measurement efficiency and the
intra-cavity probe power decrease; the former degrades achievable
squeezing, and the latter results in varying ac-Stark shifts. To remedy
this problem, we start with states deterministically pre-squeezed up to
7 dB with 99% coherence (see Methods). Figure 3a shows the noise in
the difference between two identical strength back-to-back
measurements, as well as coherences, as a function of measurement strength
(determined by incident probe power) for different atom numbers.
Spin noise reductions at optimal measurement strengths saturate with
atom number (Fig. 3b), albeit slightly earlier than expected,
suggesting unknown sources of additional noise. States prepared by the first
measurement are input states to any subsequent metrology
experiment. Owing to the additional uncorrelated noise from the second
measurement, these states contain 3 dB more squeezing than directly
observed in the measurement difference (see ref. 19). The input state
with the largest inferred metrological enhancement is at 5 × 10⁵ atoms
with a π measurement strength, which gives 20.1(3) dB (here the digit
in parentheses represents the uncertainty in the final digit of the value)
enhancement capability including the 0.6 dB loss from the measured
93.2% coherence. In Methods, we show that some of the prepared
states contain in excess of 680(35) particle entanglement (following
refs 27, 28), and also discuss the level of anti-squeezing in $J_y$.
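
Two of the decibel figures in this paragraph follow from simple arithmetic, sketched below under stated assumptions: the difference of two measurements with equal, uncorrelated noise has twice the variance of either one (about 3 dB), and a coherence (Bloch vector length relative to its maximum) of 93.2% enters the metrological enhancement squared (about 0.6 dB). The function names are illustrative.

```python
import math


def uncorrelated_noise_penalty_db(n_measurements: int = 2) -> float:
    """Variance increase, in dB, from combining equal uncorrelated noise terms."""
    return 10.0 * math.log10(n_measurements)


def coherence_loss_db(coherence: float) -> float:
    """Loss in metrological enhancement, in dB, for a given fractional coherence."""
    return -10.0 * math.log10(coherence ** 2)


print(f"difference of two equal-noise measurements: +{uncorrelated_noise_penalty_db():.2f} dB")
print(f"loss at 93.2% coherence: {coherence_loss_db(0.932):.2f} dB")
print(f"loss at 96.2% coherence: {coherence_loss_db(0.962):.2f} dB")
```

The same function gives the penalty for any other measured coherence, for example the 96.2% contrast quoted for the metrology configuration.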

Figure 4 | Clock implementation. a, Allan deviation (a measure
quantifying clock stability) of a squeezed atomic clock for 228 µs
interrogation time at 1 Hz repetition rate (filled circles). The state is
rotated into phase-sensitive orientation during the interrogation time.
Dashed line, theoretical CSS noise limit; open circles, measured CSS noise
level for the same Ramsey time. Solid line, $9.7 \times 10^{-11}\,\mathrm{s}^{1/2}/\sqrt{\tau}$. The Bloch
spheres illustrate the clock sequence with squeezed states; white arrows
depict conversion of phase jitter into population jitter. b, Inhomogeneity
analysis: back-to-back measurement noise as a function of measurement
separation time. No matching pattern in mean $J_z$ difference between the
measurements is observed; only the noise is modulated. Measurement
strengths are chosen to make the noise minima equal for different atom
numbers. Solid lines are predictions with no free parameters. Error bars in
both panels, 68% statistical confidence interval.

To demonstrate a metrology example, we decrease the first
measurement strength to 0.75π while increasing the second one to 2.0π.
In this configuration, coherence after the first measurement is 96.2%
(with 2 × 10⁻³ photons per atom free-space scattering), and the
second measurement recovers more information. We obtain a directly
measured metrological improvement of 18.5(3) dB with respect to
the SQL, with which we resolve small rotations around $J_y$ at 6.5 × 10⁵
atoms (Fig. 2b, c). This corresponds to 147 µrad (r.m.s.) single-shot
phase sensitivity.
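
The single-shot phase sensitivity can be recovered from the atom number and the measured improvement. The sketch below assumes that the SQL phase resolution for N uncorrelated atoms is 1/√N radians and that an improvement quoted in decibels (a variance ratio) reduces the r.m.s. phase noise by a factor 10^(dB/20); the function names are illustrative.

```python
import math


def sql_phase_resolution_rad(n_atoms: float) -> float:
    """Standard-quantum-limit phase resolution, 1/sqrt(N), in radians."""
    return 1.0 / math.sqrt(n_atoms)


def squeezed_phase_resolution_rad(n_atoms: float, improvement_db: float) -> float:
    """R.m.s. phase resolution after a variance improvement given in dB."""
    return sql_phase_resolution_rad(n_atoms) / 10.0 ** (improvement_db / 20.0)


sql = sql_phase_resolution_rad(6.5e5)
squeezed = squeezed_phase_resolution_rad(6.5e5, 18.5)
print(f"SQL at 6.5e5 atoms: {sql * 1e3:.2f} mrad")
print(f"18.5 dB beyond SQL: {squeezed * 1e6:.0f} urad")
```

With 6.5 × 10⁵ atoms and 18.5 dB this gives about 147 µrad, in line with the value stated above.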

To take full advantage of the achieved sensitivity for implementing
an atomic clock, we need to reduce the phase noise of our
microwave local oscillator. Nevertheless, here we include a preliminary
clock demonstration. In particular, we show that we can compare
the atomic and microwave phases to a precision better than that allowed by CSS
noise. Following $J_z$-squeezing, we apply a π/2 pulse (74 µs) to rotate
the noise ellipse into a phase-sensitive state, then wait for phase
accumulation, and finally map this phase onto $J_z$ with another
rotation (Fig. 4a inset). High-frequency microwave phase noise during
the measurement interrogation period results in excess noise
proportional to the atom number. To obtain the maximum quantum
enhancement, we lower the atom number to 1 × 10⁵ and observe up
to 10.5(3) dB metrological gain in phase comparison. To put these
measurements in context with clock performance²¹, we operate with
the largest Ramsey time (~228 µs phase accumulation) which does
not measurably degrade the comparison. We achieve 9.7 × 10⁻¹¹
fractional stability at 1 s averaging time (Fig. 4a). For the fixed Ramsey
time, the squeezed clock reaches a given precision 11.2(8) times
faster than possible without squeezing. High-frequency local
oscillator phase noise can be circumvented using interleaved clocks²⁹, in
which case the full advantage of squeezing can be achieved using
adaptive measurements³⁰.
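
The clock figures can be related to the phase-comparison gain with a back-of-the-envelope estimate. The sketch assumes that a phase uncertainty δφ accumulated over a Ramsey time T corresponds to a fractional frequency uncertainty δφ/(2πνT), that one shot per second (1 Hz repetition rate) sets the stability at 1 s, and that ν is the ⁸⁷Rb hyperfine clock frequency of about 6.8347 GHz. The frequency value and all names are assumptions introduced here, not taken from the text.

```python
import math

RB87_CLOCK_FREQUENCY_HZ = 6_834_682_610.904  # assumed 87Rb hyperfine clock frequency


def squeezed_phase_noise_rad(n_atoms: float, gain_db: float) -> float:
    """R.m.s. phase noise: SQL value 1/sqrt(N) reduced by the gain in dB."""
    return (1.0 / math.sqrt(n_atoms)) / 10.0 ** (gain_db / 20.0)


def single_shot_fractional_stability(phase_noise_rad: float,
                                     ramsey_time_s: float,
                                     frequency_hz: float = RB87_CLOCK_FREQUENCY_HZ) -> float:
    """Fractional frequency uncertainty from one Ramsey interrogation."""
    return phase_noise_rad / (2.0 * math.pi * frequency_hz * ramsey_time_s)


phase_noise = squeezed_phase_noise_rad(1.0e5, 10.5)
stability_1s = single_shot_fractional_stability(phase_noise, 228e-6)
speed_up = 10.0 ** (10.5 / 10.0)  # averaging-time reduction for a variance gain in dB

print(f"estimated fractional stability at 1 s: {stability_1s:.2e}")
print(f"averaging-time speed-up for 10.5 dB: {speed_up:.1f}x")
```

This simplified estimate lands within a few per cent of the reported 9.7 × 10⁻¹¹ and reproduces the factor of about 11.2 in averaging time.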

The squeezed states prepared in this work can be released from the
confining lattice for precision sensing applications. To infer the
retrievable squeezing in a configuration where the second cavity
measurement is replaced with fluorescence imaging in free space, we analyse
the noise modulations observed in back-to-back cavity measurements
(Fig. 4b) as a function of delay time between the measurements. These
modulations are evidently due to residual atom-probe coupling
inhomogeneities, and on the basis of our model, we anticipate retrieving
up to 14.6 dB squeezing (see Methods).

METHODS


Squeezing measurement details. The incident probe laser power is always
increased and decreased adiabatically. The duration of the measurement pulses is
200 µs, the shortest possible time that avoids significant ringing in the
homodyne signal. Typical incident probe powers are of the order of 10 pW (intra-cavity
photon numbers: 90 photons per pW on resonance). Since the homodyne signal
is time dependent, we apply a time-dependent weighting function in the analysis
of the time traces to extract the cavity frequency shift. This method gives the same
signal and photon shot noise values irrespective of pulse shapes and duration as
long as the pulse area is conserved and the intra-cavity power adiabatically follows
the incident power. Quantitatively, given a time-dependent homodyne signal s(t),
the cavity shift is $\Delta\nu = D \int dt\, s(t)\,p(t) \big/ \int dt\, p^2(t)$, where D is the frequency
discriminator (Hz/V) at the peak signal level, and p(t) is the temporal shape of the
signal normalized such that its peak is at 1. p(t) and D are determined
experimentally in the absence of atoms by detuning the probe from the exact cavity resonance
by a small amount (well within cavity linewidth), and recording the homodyne
signal.
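
The weighted estimate of the cavity shift, D ∫dt s(t) p(t) / ∫dt p²(t), can be sketched numerically as a discrete sum over a sampled time trace. Everything specific in the example is hypothetical: the sin² pulse shape, the sample count, the discriminator value and the injected shift are chosen only to show that the estimator returns the shift that was put in.

```python
import math


def cavity_shift_hz(signal_volts, pulse_shape, discriminator_hz_per_volt, dt_s):
    """Weighted cavity-shift estimate D * sum(s p dt) / sum(p^2 dt)."""
    overlap = sum(s * p for s, p in zip(signal_volts, pulse_shape)) * dt_s
    norm = sum(p * p for p in pulse_shape) * dt_s
    return discriminator_hz_per_volt * overlap / norm


# Hypothetical 200 us pulse with a smooth sin^2 envelope whose peak is 1.
PULSE_DURATION_S = 200e-6
N_STEPS = 400
DT_S = PULSE_DURATION_S / N_STEPS
shape = [math.sin(math.pi * i / N_STEPS) ** 2 for i in range(N_STEPS + 1)]

DISCRIMINATOR_HZ_PER_VOLT = 2000.0  # hypothetical calibration value
TRUE_SHIFT_HZ = 100.0               # hypothetical injected cavity shift

# A noiseless homodyne trace for that shift follows the pulse shape.
signal = [TRUE_SHIFT_HZ / DISCRIMINATOR_HZ_PER_VOLT * p for p in shape]

recovered = cavity_shift_hz(signal, shape, DISCRIMINATOR_HZ_PER_VOLT, DT_S)
print(f"recovered shift: {recovered:.3f} Hz")
```

Because the weighting is by the pulse shape itself, samples taken where the probe is weak contribute little to the estimate.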

The procedure above extracts the correct cavity shift as long as the homodyne 
signal is in the linear regime. We apply an additional correction factor to properly 
measure the shifts which are outside this linear regime. This factor is calibrated by 
setting the cavity-probe detuning to a known value and noting the discrepancy 
between the known shift and the inferred shift. This is important when establishing 
CSS noise levels. 

At large atom numbers the cavity linewidth broadens owing to atomic scattering,
causing the squeezing saturation. The atom-number-dependent linewidth is
$\kappa = \kappa_0 + \kappa_s$, with $\kappa_0$ the empty cavity linewidth and $\kappa_s$ the additional scattering
contribution. The fractional change is $\kappa_s/\kappa_0 = NC(\Gamma/\omega_{\mathrm{HF}})^2$. We incorporate this
signal degrading effect into the cavity shift analysis. The change in the overall
signal shape due to broadening is negligible for our parameter range. We
experimentally verify that we indeed obtain $\kappa_s/\kappa_0 \approx 0.30$ at 5 × 10⁵ atoms. To do this, we
jump the cavity frequency by a known amount (smaller than $\kappa$) between
consecutive measurements and observe the signal reduction in comparison to the empty
cavity case. This measurement takes 10 min; it would have taken 11 h to reach the
same precision had we not employed squeezed states.
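
The size of the broadening can be estimated from the expression κ_s/κ₀ = NC(Γ/ω_HF)². The sketch below plugs in C = 0.78 from the main text together with two atomic constants that are assumptions introduced here and not stated in the text: an excited-state linewidth Γ/2π of about 6.07 MHz (the ⁸⁷Rb D2 line, probed at 780 nm) and a hyperfine splitting ω_HF/2π of about 6.835 GHz. It also compares the quoted 10 min and 11 h averaging times.

```python
import math

COOPERATIVITY = 0.78                  # C, from the main text
LINEWIDTH_HZ = 6.0666e6               # assumed Gamma / 2 pi for the 87Rb D2 line
HYPERFINE_SPLITTING_HZ = 6.834682611e9  # assumed omega_HF / 2 pi for 87Rb


def fractional_broadening(n_atoms: float,
                          cooperativity: float = COOPERATIVITY) -> float:
    """kappa_s / kappa_0 = N C (Gamma / omega_HF)^2."""
    return n_atoms * cooperativity * (LINEWIDTH_HZ / HYPERFINE_SPLITTING_HZ) ** 2


broadening = fractional_broadening(5.0e5)
time_ratio = (11 * 60) / 10  # 11 h versus 10 min, both in minutes

print(f"kappa_s / kappa_0 at 5e5 atoms: {broadening:.2f}")
print(f"averaging-time ratio: {time_ratio:.0f}x ({10 * math.log10(time_ratio):.1f} dB)")
```

With these inputs the broadening is about 0.31, close to the measured value of roughly 0.30, and the time saving of about 66-fold is comparable to the 70-fold (18.5 dB) improvement reported in the main text.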
Homodyne detection system. The homodyne detection setup (Fig. 1c) is seeded
with 780 nm light obtained by frequency doubling the 1,560 nm lattice light, which
is frequency stabilized to the main cavity following an intermediate stabilization
step involving a scrubbing cavity (to narrow down the linewidth of the laser). Thus,
the 780 nm light is already stable in the short term with respect to the cavity
(80 mHz/√Hz level in the 0.2–4 kHz band, about 8 Hz (r.m.s.) stability). In the
long term, thermal drifts in the cavity mirror coatings cause variations in the
individual cavity lengths seen by the 780 nm and 1,560 nm light. During experimental
cycles of 1 s, the probe frequency can drift by 100 Hz (r.m.s.); we correct the drift
at the end of every cycle with an auxiliary empty cavity measurement.

Using a 200 μW local oscillator (shifted by 80 MHz from input) on path A of the
homodyne system, the balanced detectors operate photon shot-noise limited from
10 Hz up to 5 MHz. Two 10 nW spectral components on path B (offset by 78 MHz
and 82 MHz from input) travel to the cavity and promptly reflect back from the first 
mirror to give a heterodyne beat-note signal (2 MHz) at the detectors. This signal 
is used to stabilize the interferometer path lengths via feedback onto the AOM 
on path A. Stabilization covers the DC to 15 kHz frequency band, thus removing 
the influence of optical phase noise for the squeezing measurements. Path B also 
contains the probe (80 MHz offset from input), interfering with the local oscillator 
to form the homodyne signal after returning from the cavity. 

The overall detection efficiency limiting the achievable squeezing is $\epsilon = 0.16$.
The breakdown is as follows: a factor of 0.50, since we collect light only from one 
cavity mirror; 0.57 due to loss in cavity mirrors; 0.80 backwards fibre coupling effi- 
ciency; 0.85, loss in isolator and other optical elements; 0.85, interferometer mode 
matching efficiency. Multiplying these factors results in the stated overall efficiency. 
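The multiplication described above can be reproduced in a few lines. This is only an arithmetic check of the listed factors; the dictionary labels are shorthand introduced here.

```python
import math

# Efficiency factors listed in the text, in the order given.
factors = {
    "single-mirror collection": 0.50,
    "cavity mirror loss": 0.57,
    "backwards fibre coupling": 0.80,
    "isolator and other optics": 0.85,
    "interferometer mode matching": 0.85,
}

overall = math.prod(factors.values())
print(f"overall detection efficiency: {overall:.3f}")
```

The product rounds to the stated overall efficiency of 0.16.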

Our ability to estimate the centroid of the cavity resonance frequency improves
with the square root of the number of photons contained in a 200 μs measurement
pulse, and saturates around 15 Hz owing to laser frequency instability.
Inhomogeneity analysis for free-space release. Although we have uniform atom- 
cavity coupling, we still have residual inhomogeneities due to thermal motion. 
Figure 4b shows the noise between back-to-back measurements for varying meas- 
urement separations: noise is smallest when the separation is half-integer multiples 
of the 2.2 ms transverse-oscillation period. We model this behaviour to make an 
estimate of retrievable squeezing with fluorescence imaging. 

The observed effect can be explained with the statistical fluctuations in atomic 
positions from one experimental cycle to the next. A classical trajectory analysis 
on the atoms in a harmonic trap (transverse motion) suffices to predict the noise 
in two consecutive cavity measurements as a function of measurement separation 
time. Given their initial phase space points during the first measurement, the location
of the atoms during the second measurement, and hence the cavity frequency
difference, can be deterministically predicted. However, each time the experiment 
is repeated, the atoms will start at a different configuration in phase space giving 
rise to a slightly different cavity frequency difference. We calculate this additional 
noise assuming a thermal distribution for the atoms. The resulting prediction is 
2 1 1 2 ne 
(As2)"? = JN(1+ 0)(; +20? 1+2a?+4a4(1—costwt) 1+2a7+ a4sin2ut) ) , 
where $\omega$ is the transverse oscillation angular frequency and $\alpha = 2\sigma_t/w_{780}$, with $\sigma_t$
the transverse size of the atomic cloud and $w_{780}$ the probe beam waist ($\alpha^2 = 0.076$).
The solid lines in Fig. 4b are the addition of this function in quadrature with the
baseline without any free parameters.

A similar analysis could be used to predict the additional noise that will arise 
when the first measurement is done using the cavity and the second one using 
fluorescence imaging. The second measurement is insensitive to atomic positions; 
hence the noise comes only from the statistical fluctuations in atomic positions 
during the first measurement. The estimate for the additional noise is
$\langle\Delta J_z^2\rangle^{1/2} = \sqrt{N}\,\alpha^2/\sqrt{1+2\alpha^2}$, which is 17 dB below CSS noise for our experiment.
In particular, an 18.5 dB squeezed state would read 14.6 dB with fluorescence
imaging.
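The two numbers quoted above can be checked with a short calculation. The sketch assumes that the additional noise is $\sqrt{N}\,\alpha^2/\sqrt{1+2\alpha^2}$, that CSS noise corresponds to a standard deviation of $\sqrt{N}/2$, and that independent noise contributions add as variances; the function names are illustrative.

```python
import math

ALPHA_SQ = 0.076  # alpha^2 quoted in the text


def added_noise_db_below_css(alpha_sq: float) -> float:
    """Additional noise relative to CSS noise, in dB (positive = below CSS).

    Uses sqrt(N) * alpha^2 / sqrt(1 + 2 alpha^2) for the added noise and
    assumes the CSS noise is sqrt(N) / 2, so N cancels in the ratio.
    """
    ratio = 2 * alpha_sq / math.sqrt(1 + 2 * alpha_sq)
    return -20 * math.log10(ratio)


def observed_squeezing_db(true_db: float, added_db: float) -> float:
    """Add two noise variances (each given in dB below CSS) in quadrature."""
    total_variance = 10 ** (-true_db / 10) + 10 ** (-added_db / 10)
    return -10 * math.log10(total_variance)


added = added_noise_db_below_css(ALPHA_SQ)
readout = observed_squeezing_db(18.5, 17.0)
print(f"added noise: {added:.1f} dB below CSS")
print(f"18.5 dB state read out as: {readout:.1f} dB")
```

With these assumptions the added noise evaluates to about 17 dB below CSS noise, and the combined readout lands within roughly 0.1 dB of the quoted 14.6 dB.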

Atom/cavity parameters. For transitions from the clock states with π-polarized
light, the single atom cooperativity is $C = 4g^2/(\kappa_0\Gamma) = 0.78$, with atom-cavity
coupling $g = 2\pi \times 96.7$ kHz, empty cavity decay rate $\kappa_0 = 2\pi \times 8.0$ kHz, and atomic
decay rate $\Gamma = 2\pi \times 6.06$ MHz. The value for $g^2$ is an average over the position
distribution of the atoms inside the lattice due to the finite temperature. The 25 μK
atoms are distributed over about 1,000 lattice sites. The r.m.s. atomic cloud size
inside each 520-μK-deep lattice site is 17 μm in the transverse direction and 37 nm
in the axial direction. The trap frequencies in the corresponding directions are
460 Hz and 205 kHz. We vary the atom number in the experiment by changing the
initial magneto-optical trap (MOT) loading time. Strictly speaking, the residual
inhomogeneity in atom-cavity coupling due to thermal motion requires one to
define effective atom numbers $N_{\rm eff}$ for the purpose of identifying the CSS noise
level. This method was adopted in refs 19, 20, 21 and 24, owing to lack of
homogeneity in couplings, resulting in $N_{\rm eff} \approx 0.66N_0$, with $N_0$ denoting the real atom
number. In this work we achieve $N = N_{\rm eff} \approx 0.995N_0$ (see next section for details).
For practical purposes, we do not differentiate between real and effective atoms.
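The quoted cooperativity and effective atom number can be cross-checked from the listed parameters. The cooperativity follows directly from $C = 4g^2/(\kappa_0\Gamma)$. For the effective atom number, the sketch assumes a Gaussian probe mode sampled by a two-dimensional Gaussian cloud, so that the coupling weight f satisfies ⟨f⟩ = 1/(1+α²) and ⟨f²⟩ = 1/(1+2α²), and takes $N_{\rm eff}/N_0$ = ⟨f⟩²/⟨f²⟩; this model is an assumption made here, not a statement of how the values in the text were obtained.

```python
import math

# Parameters quoted in the text (angular frequencies in rad/s).
g = 2 * math.pi * 96.7e3        # atom-cavity coupling
kappa_0 = 2 * math.pi * 8.0e3   # empty-cavity decay rate
gamma = 2 * math.pi * 6.06e6    # atomic decay rate

# Single-atom cooperativity C = 4 g^2 / (kappa_0 * Gamma).
cooperativity = 4 * g**2 / (kappa_0 * gamma)

# Assumed model: coupling weight f = exp(-alpha^2 r^2 / (2 sigma^2)) for a
# 2D Gaussian cloud of r.m.s. size sigma, which gives
# <f> = 1 / (1 + alpha^2) and <f^2> = 1 / (1 + 2 alpha^2).
alpha_sq = 0.076
mean_f = 1 / (1 + alpha_sq)
mean_f_sq = 1 / (1 + 2 * alpha_sq)
n_eff_over_n0 = mean_f**2 / mean_f_sq

print(f"C = {cooperativity:.2f}")
print(f"N_eff / N_0 = {n_eff_over_n0:.3f}")
```

The cooperativity comes out near 0.77, consistent with the quoted 0.78 to within the rounding of the inputs, and the effective atom fraction reproduces the quoted 0.995 under the stated model.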

Coupling inhomogeneity. We would like to measure the collective observable
$J_z = \sum_{n=1}^{N_0} j_z^{(n)}$, with $N_0$ the total number of atoms, and $j_z^{(n)} = \tfrac{1}{2}\sigma_z^{(n)}$ the z-component
of the spin operator for atom n. However, because of the residual inhomogeneity
in the atom-cavity coupling we measure a slightly different collective observable
$J_z'$ that is a weighted sum over $j_z^{(n)}$.

If all atoms coupled identically to the cavity with the per-spin-flip cavity
frequency shift for atom n given by $\delta_n = \delta_0$, the total shift would be
$\Delta_0 = \sum_n \delta_0\, j_z^{(n)} = \delta_0 J_z$. However, owing to the small fractional deviations $\epsilon_n$ in the
coupling constants for the atoms n, the total shift is given by:

$$\Delta = \sum_n \delta_n\, j_z^{(n)} = \sum_n \delta_0 (1-\epsilon_n)\, j_z^{(n)} = (\delta_0 Z)\,\frac{1}{Z}\sum_n (1-\epsilon_n)\, j_z^{(n)} = \delta J_z'.$$

Here Z is a normalization constant, and $\delta = \delta_0 Z$ is the effective cavity shift per
spin flip. To decide on a normalization, we utilize two properties of $J_z'$: its
maximum $(J_z')_{\max} = \frac{N_0}{2Z}\langle 1-\epsilon\rangle_e$, and its projection noise with uncorrelated atoms
$\mathrm{var}(J_z')_{\rm proj} = \frac{N_0}{4Z^2}\langle (1-\epsilon)^2\rangle_e$. Here $\langle\cdot\rangle_e$ indicates an ensemble average. We
choose Z such that the statistical condition $\frac{\mathrm{var}(J_z)_{\rm proj}}{(J_z)_{\max}} = \frac{N_0/4}{N_0/2} = \frac{1}{2}$ satisfied by $J_z$ is also
satisfied by $J_z'$. This leads to $Z = \frac{\langle (1-\epsilon)^2\rangle_e}{\langle 1-\epsilon\rangle_e}$ and thus:

$$J_z' = \frac{\langle 1-\epsilon\rangle_e}{\langle (1-\epsilon)^2\rangle_e}\sum_n (1-\epsilon_n)\, j_z^{(n)}.$$

Consequently, one can think of the non-uniformly coupled system of $N_0$ atoms
as a uniformly coupled system of $N = N_{\rm eff} = N_0\,\frac{\langle 1-\epsilon\rangle_e^2}{\langle (1-\epsilon)^2\rangle_e} = 0.995N_0$ effective
atoms in conjunction with an effective cavity shift $\delta_{\rm eff} = \delta = 0.936\,\delta_0$ per
spin-flip. Here $\delta_0$ is the cavity shift for an atom on the cavity axis averaged over the
distribution along the tightly trapped longitudinal direction. This gives
$\delta_{\rm eff} = 0.83\,\delta_{\max} = 5.5$ Hz, where $\delta_{\max}$ is the cavity shift for an atom localized at a peak
of the probe mode profile.

Fluorescence imaging. Using the cavity, we detect $J_z$ only in a very narrow range.
For observing $J_z$ in its full range of ±N/2 we use fluorescence imaging. We release
the atoms from the lattice, and for state selectivity, we apply a laser beam resonant
with the F = 2 to F′ = 3 transition to momentarily push the F = 2 (that is, |↑⟩) atoms.
This spatially separates the |↑⟩ and |↓⟩ states, permitting the simultaneous detection
of the number of atoms in both states. We image the fluorescence from the two
clouds for 2 ms on a CCD camera using the cooling and re-pumping lights of the
MOT. This is how the observed Rabi and Ramsey fringes are mapped out and hence
how the coherence is measured.

The pre-squeezing procedure. The pre-squeezing procedure is a supplement to
the $J_z$ measurement protocol, increasing robustness and efficiency. However, we
note that it does not alter the nature of $J_z$ measurements: we obtain the same final
squeezing results in the absence of pre-squeezing if we post-select the runs with
the first measurement outcome lying within the linear regime of the homodyne
signal.

When approaching the saturated regime of squeezing, discussed in the main
text, the r.m.s. cavity shifts due to CSS noise approach the linewidth of the cavity.
This quantum noise prevents us from preparing initial $J_z$ distributions that fall
purely within the linear region of the homodyne signal using microwave rotations
alone. We therefore resort to atom-cavity nonlinearities to deterministically
pre-squeeze the state by a sufficient amount such that the distribution fits within the
linear regime. The nonlinearity employed is a $J_z^2$ interaction causing one-axis
twisting, similar to the one used in ref. 20. For an initial state near the $J_x$ axis, the
procedure dynamically compresses the distribution on the Bloch sphere in the
z direction while expanding it in the y direction. The technical noise on the initial
$J_z$ preparation is also suppressed. The pre-squeezing occurs after the composite
π/2 pulse brings the state to the equator. We send 100 nW of light at $6.25\kappa_0$
detuning from the bare cavity resonance (generating $J_z^2$ interaction) and simultaneously
turn on a 400 μs microwave π/12 pulse (generating rotations around the $J_x$ axis).
With the combined action, we observe up to 7 dB unconditional $J_z$ squeezing while
retaining 99% coherence.

Amount of anti-squeezing. The cavity measurements, while projecting the
atomic ensemble onto a state with reduced $J_z$ noise, also act back onto the
conjugate observable $J_y$ and increase its noise. In the ideal case a measurement would
preserve the area of the uncertainty ellipse, that is, the reduction and the increase
in the noises of $J_z$ and $J_y$ respectively would be through the same factor. However,
owing to photon losses, inefficiencies in extracting the information in the
readout, and the additional spin-flip noise in $J_z$, the balance is expected to be broken.
Experimentally, the variance of $J_y$ scales linearly with measurement strength,
reaching 39 dB above CSS noise at π measurement strength accompanying the
quoted 20.1(3) dB squeezing in $J_z$. This corresponds to a factor of 8.8 increase in
the uncertainty ellipse area. The observed level of anti-squeezing is within 2 dB
of the expectations.
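The factor of 8.8 follows from the two decibel figures above. A minimal check, assuming the ellipse area scales with the product of the standard deviations of the squeezed and anti-squeezed quadratures:

```python
anti_squeezing_db = 39.0   # anti-squeezed variance above CSS noise
squeezing_db = 20.1        # squeezed variance below CSS noise

# The ellipse area scales with the product of the two standard deviations,
# so the net variance change in dB is divided by 20 rather than 10.
area_factor = 10 ** ((anti_squeezing_db - squeezing_db) / 20)
print(f"uncertainty-ellipse area increase: x{area_factor:.1f}")
```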

Squeezing limits calculation. The maximum attainable spin noise reduction can be
found by considering the individual effects of the measurement noise and the spin-flip
noise. We will specify these quantities as functions of the (experimentally accessible)
differential phase shift $\phi_{\rm ac} = \frac{4g^2}{\omega_{\rm HF}}\int dt\, n_c(t)$ accumulated on the clock states. Here,
$n_c(t)$ is the intra-cavity photon number. We will also use the cooperativity,
$C = 4g^2/(\kappa_0\Gamma)$.

The number of scattered photons per atom is $m_s = \phi_{\rm ac}(\Gamma/\omega_{\rm HF})$. The hyperfine splitting
enters because the atom-cavity detuning is set to $\Delta = \omega_{\rm HF}/2$. Using the branching
ratios for ⁸⁷Rb, it can be shown that only 1/6 of the scattering events will give rise
to a spin-flip, that is, a change of the hyperfine state. These spin-flips will give rise
to a random walk on $J_z$ with a variance of $\sigma^2_{\rm flip} = \frac{N}{6}\frac{\Gamma}{\omega_{\rm HF}}\phi_{\rm ac}$. This is the spin-flip
noise; it grows with atom number and probe power.

To examine the measurement noise, we analyse the information imprinted on
the light transmitted from the cavity. The total decay rate of the cavity is
$\kappa = 2\kappa_m + \kappa_l + \kappa_s$, where $\kappa_m$ is due to mirror out-coupling, and $\kappa_l$ is due to
optical losses in the mirrors. The term due to atomic scattering can be expressed as
$\kappa_s/\kappa_0 = NC(\Gamma/\omega_{\rm HF})^2$, where $\kappa_0 = 2\kappa_m + \kappa_l$. Around zero cavity–probe detuning,
the number of photons transmitted through the cavity is $n_T = \frac{\epsilon_c}{2C}\frac{\omega_{\rm HF}}{\Gamma}\phi_{\rm ac}$,
where $\epsilon_c = 2\kappa_m/\kappa_0$ is the cavity efficiency, incorporating the optical losses. As the
atoms shift the cavity frequency by $(2g^2/\Delta)J_z$, the phase shift on the light upon
transmission is

$$\varphi = \frac{2C(\Gamma/\omega_{\rm HF})}{1+NC(\Gamma/\omega_{\rm HF})^2}\,J_z.$$

Given the quantum phase noise for a coherent state, the noise equivalent $J_z$
resolution is $\frac{1+NC(\Gamma/\omega_{\rm HF})^2}{\sqrt{8\epsilon_c C(\Gamma/\omega_{\rm HF})\phi_{\rm ac}}}$. For a
symmetric cavity, equal amounts of information leak out from each mirror. Thus,
including the information gained from the reflection would improve the resolution
by $\sqrt{2}$. Lastly, we also include the effect of photon losses on the way to the detectors,
and bundle all efficiency factors into the quantity $\epsilon$ which is further discussed in
the ‘Atom/cavity parameters’ section above. The final expression for the
measurement noise is

$$\sigma^2_{\rm meas} = \frac{\left(1+NC(\Gamma/\omega_{\rm HF})^2\right)^2}{16\,\epsilon\, C\,(\Gamma/\omega_{\rm HF})}\,\frac{1}{\phi_{\rm ac}};$$

it decreases with increasing probe power.

As a first approximation, the total noise in the estimation of $J_z$ can be found by
adding the contribution due to the two sources: $\sigma^2 = \sigma^2_{\rm meas} + \sigma^2_{\rm flip}$. An optimal $\phi_{\rm ac}$
(that is, measurement strength) minimizes this expression, at which point
$\sigma^2_{\rm opt} = \left(1+NC(\Gamma/\omega_{\rm HF})^2\right)\sqrt{N/(24\,\epsilon\, C)}$. Assuming negligible coherence loss, as is the case
experimentally, we arrive at an optimal metrological enhancement

$$\chi^2_{\rm opt} = \frac{\sqrt{\tfrac{3}{2}\,\epsilon N C}}{1+NC(\Gamma/\omega_{\rm HF})^2}.$$

At first, the achievable enhancement increases with atom number, attaining a
maximum of $(\omega_{\rm HF}/\Gamma)\sqrt{3\epsilon/8}$ at $N_{\rm opt} = \omega_{\rm HF}^2/(\Gamma^2 C)$. This saturation effect can be traced
back to cavity linewidth broadening from atomic absorption. For $\epsilon = 1$, the
maximum achievable enhancement is around 28 dB.

Exact numerical agreement should not be expected between the naive model
presented here and the experiment, since the latter is more complicated. We use
the functional forms derived here to fit to the data in Fig. 3. In particular, we fit
Fig. 3a with the functional form of $\sigma^2$ and Fig. 3b with that of $\chi^2_{\rm opt}$, with
α, β and γ as fit parameters.
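The saturation of the enhancement can be illustrated numerically. The sketch below takes the optimal enhancement to be $\sqrt{3\epsilon NC/2}/(1+NC(\Gamma/\omega_{\rm HF})^2)$, which is our reading of the model expression in this section, and assumes the standard ⁸⁷Rb hyperfine splitting of about 2π × 6.835 GHz. The function and variable names are illustrative.

```python
import math

C = 0.78                            # single-atom cooperativity (from the text)
GAMMA = 2 * math.pi * 6.06e6        # atomic decay rate (rad/s, from the text)
OMEGA_HF = 2 * math.pi * 6.8347e9   # assumed 87Rb hyperfine splitting (rad/s)
X = GAMMA / OMEGA_HF


def enhancement(n_atoms: float, eff: float) -> float:
    """Model enhancement sqrt(3 eff N C / 2) / (1 + N C X^2)."""
    nc = n_atoms * C
    return math.sqrt(1.5 * eff * nc) / (1 + nc * X**2)


n_opt = 1 / (C * X**2)                  # omega_HF^2 / (Gamma^2 C)
peak = enhancement(n_opt, eff=1.0)
closed_form = math.sqrt(3 / 8) / X      # (omega_HF / Gamma) sqrt(3 eff / 8)
peak_db = 10 * math.log10(peak)

print(f"N_opt = {n_opt:.2e} atoms")
print(f"peak enhancement (eff = 1): {peak_db:.1f} dB")
```

Under these assumptions the peak sits at a little over 10⁶ atoms and evaluates to roughly 28 dB for unit efficiency, matching the figure quoted in the text; passing `eff=0.16` to `enhancement` gives the corresponding model value for the stated detection efficiency.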
Entanglement depth. Measurements of the collective spin operators $J_i = \sum_{n=1}^{N} j_i^{(n)}$
for an ensemble of N two-level atoms can be used to quantify the amount of entan- 
glement in the ensemble”””* without further reference to the specific nature of the 
states. We will follow the analysis of ref. 28, where the $J_z$ variance and the
mean-square Bloch vector length place the measured states on a plane with boundaries
corresponding to different entanglement depths.

In our experiment, because of the residual inhomogeneities in atom-cavity
coupling, we measure the collective observable $J_z' = \frac{1}{Z}\sum_{n=1}^{N_0}(1-\epsilon_n)\, j_z^{(n)}$, where
Z is a normalization constant and $\epsilon_n$ is the small fractional deviation in coupling
for atom n (see ‘Coupling inhomogeneity’ section). This implies that we cannot
directly utilize the measured spin noise values for the purposes of calculating
entanglement depths. However, even without a direct measurement of $J_z$ itself, its
maximum possible variance can be inferred, as was argued in the Methods section
‘Inhomogeneity analysis for free-space release’. There, we found that there would
be an additive noise of 17 dB below CSS noise if we tried to read out the prepared
states via fluorescence imaging, which we consider to be a true $J_z$ measurement.
Therefore, we infer $J_z$ on the basis of this analysis. Unlike the cavity-based
measurements, the Bloch vector length measurements, which are done via fluorescence
imaging, can directly be used for calculating entanglement depths.

In Extended Data Fig. 1, we plot the inferred $J_z$ variances for the 5 × 10⁵ atom
data set using the experimentally established noise after the first measurement
(½ of the values in Fig. 3a). The point with the largest metrological gain (π
measurement strength) gives an entanglement depth of 330(15) atoms, while the largest
entanglement depth is 680(35) atoms (0.5π measurement strength). This
exemplifies that entanglement depth is in itself not a direct predictor for metrological
improvement.

The additional noise in our model in inferring $J_z$ originates from shot-to-shot
randomization of atomic positions. In the absence of this randomization, we expect
the discrepancy between the variances of $J_z'$ and $J_z$ to be less. Thus, the quoted
entanglement depths should be taken as lower bounds. For reference, had we not
taken into account the coupling inhomogeneity we would have found the largest
entanglement depth to be 1,605(30) atoms.

Sample size. No statistical methods were used to predetermine sample size.
Extended Data Figure 1 | Inferred entanglement depths, quantifying
multi-particle entanglement. The inferred spin noise variance (y axis)
and the mean-square Bloch vector lengths (x axis) are plotted for the
5 × 10⁵ atom data set. Note that the probe power decreases from left
to right. The x-axis values are conservatively chosen to be the most
probable value of the measured Bloch vector length distributions
(Fig. 2c). A state below an M-particle boundary (purple lines labelled
with particle numbers) is guaranteed to contain at least groups of
M particles whose quantum states are non-separable. The blue data set
establishes a lower bound on entanglement depth taking into account the
residual inhomogeneity in atom-cavity coupling. The red data set, for
reference, shows what we would have obtained had we ignored the small
inhomogeneity. The ellipses correspond to the 68% statistical confidence
intervals on the quoted values. $J_{\max} = N/2$. The third data point in each set
shows the largest metrological improvement.
Fully integrated wearable sensor arrays for
multiplexed in situ perspiration analysis


Wei Gao, Sam Emaminejad, Hnin Yin Yin Nyein, Samyuktha Challa, Kevin Chen, Austin Peck,
Hossain M. Fahad, Hiroki Ota, Hiroshi Shiraki, Daisuke Kiriya, Der-Hsien Lien, George A. Brooks,
Ronald W. Davis & Ali Javey


Wearable sensor technologies are essential to the realization of 
personalized medicine through continuously monitoring an 
individual’s state of health!-!?. Sampling human sweat, which is 
rich in physiological information'’, could enable non-invasive 
monitoring. Previously reported sweat-based and other non- 
invasive biosensors either can only monitor a single analyte 
at a time or lack on-site signal processing circuitry and sensor 
calibration mechanisms for accurate analysis of the physiological 
state¹⁴⁻¹⁸. Given the complexity of sweat secretion, simultaneous and
multiplexed screening of target biomarkers is critical and requires 
full system integration to ensure the accuracy of measurements. 
Here we present a mechanically flexible and fully integrated (that 
is, no external analysis is needed) sensor array for multiplexed 
in situ perspiration analysis, which simultaneously and selectively 
measures sweat metabolites (such as glucose and lactate) and 
electrolytes (such as sodium and potassium ions), as well as the 
skin temperature (to calibrate the response of the sensors). Our 
work bridges the technological gap between signal transduction, 
conditioning (amplification and filtering), processing and 
wireless transmission in wearable biosensors by merging plastic- 
based sensors that interface with the skin with silicon integrated 
circuits consolidated on a flexible circuit board for complex signal 
processing. This application could not have been realized using 
either of these technologies alone owing to their respective inherent 
limitations. The wearable system is used to measure the detailed 
sweat profile of human subjects engaged in prolonged indoor and 
outdoor physical activities, and to make a real-time assessment of 
the physiological state of the subjects. This platform enables a wide 
range of personalized diagnostic and physiological monitoring 
applications. 

Wearable electronics are devices that can be worn or mated with 
human skin to continuously and closely monitor an individual's activ- 
ities, without interrupting or limiting the user’s motions'~®. Thus 
wearable biosensors could enable real-time continuous monitoring 
of an individual’s physiological biomarkers'°"'”. At present, com- 
mercially available wearable sensors are only capable of tracking an 
individual's physical activities and vital signs (such as heart rate), and 
fail to provide insight into the user’s health state at molecular levels. 
Measurements of human sweat could enable such insight, because 
it contains physiologically and metabolically rich information that 
can be retrieved non-invasively’’. Sweat analysis is currently used for 
applications such as disease diagnosis, drug abuse detection, and ath- 
letic performance optimization. For these applications, the sample 
collection and analysis are performed separately, failing to provide a 
real-time profile of sweat content secretion, while requiring extensive 
laboratory analysis using bulky instrumentation’. Recently, wearable 
sweat sensors have been developed, with which a variety of biosensors
have been used to measure analytes of interest (Supplementary
Table 1)¹⁴⁻¹⁸.

Given the multivariate mechanisms that are involved in sweat secre- 
tion, an attractive strategy would be to devise a fully integrated mul- 
tiplexed sensing system to extract the complex information available 
from sweat. Here we present a wearable flexible integrated sensing array 
(FISA) for simultaneous and selective screening of a panel of biomark- 
ers in sweat (Fig. 1a). Our solution bridges the existing technological 
gap between signal transduction (electrical signal generation by sen- 
sors), conditioning (here, amplification and filtering), processing (here, 
calibration and compensation) and wireless transmission in wearable 
biosensors by merging commercially available integrated-circuit tech- 
nologies, consolidated on a flexible printed circuit board (FPCB), with 
flexible and conforming sensor technologies fabricated on plastic sub- 
strates. This approach decouples the stringent mechanical requirements 
at the sensor level and electrical requirements at the signal condition- 
ing, processing and transmission levels, and at the same time exploits 
the strengths of the underlying technologies. The independent and 
selective operation of individual sensors is preserved during multi- 
plexed measurements by employing highly specific surface chemistries 
and by electrically decoupling the operating points of each sensor’s 
interface. This platform is a powerful tool with which to advance large- 
scale and real-time physiological and clinical studies by facilitating the 
identification of informative biomarkers in sweat. 

As illustrated in Fig. 1a, the FISA allows simultaneous and selec- 
tive measurement of a panel of metabolites and electrolytes in human 
perspiration as well as skin temperature during prolonged indoor and 
outdoor physical activities. By fabricating the sensors on a mechanically 
flexible polyethylene terephthalate (PET) substrate, a stable sensor- 
skin contact is formed, while the FPCB technology is exploited to 
incorporate the critical signal conditioning, processing, and wireless 
transmission functionalities using readily available integrated-circuit 
components (Fig. 1b). The panel of target analytes and skin temperature 
was selected to facilitate an understanding of an individual's physiological
state (see Supplementary Information for selection of the target
analytes). For example, excessive loss of sodium and potassium in
sweat could result in hyponatremia, hypokalemia, muscle cramps or
dehydration²⁰; sweat glucose is reported to be metabolically related to
blood glucose²¹; sweat lactate can potentially serve as a sensitive marker
of pressure ischaemia²²; and skin temperature is clinically informative
of a variety of diseases and skin injuries such as pressure ulcers²³,²⁴.
Additionally, skin temperature measurements are needed to compen- 
sate for and eliminate the influence of temperature variation in the 
readings of the chemical sensors through a built-in signal processor. 

Figure 1c illustrates the schematic of the multiplexed sensor array
(each electrode is 3 mm in diameter) for sweat analysis; fabrication
processes are detailed in Methods and Extended Data Fig. 1. Here,
amperometric glucose and lactate sensors (with current output) are
based on glucose oxidase and lactate oxidase immobilized within a
permeable film of the linear polysaccharide chitosan. A Ag/AgCl
electrode serves as a shared reference electrode and counter electrode for
both sensors. The use of Prussian blue dye as a mediator minimizes the
reduction potentials to approximately 0 V (versus Ag/AgCl) (Extended
Data Fig. 2a), and thus eliminates the need for an external power source
to activate the sensors. These enzymatic sensors autonomously generate
current signals proportional to the abundance of the corresponding
metabolites between the working electrode and the Ag/AgCl electrode.
The measurement of Na⁺ and K⁺ levels is facilitated through the use of
ion-selective electrodes (ISEs), coupled with a polyvinyl butyral
(PVB)-coated reference electrode to maintain a stable potential in solutions
with different ionic strengths (Extended Data Fig. 2b–d). By using
poly(3,4-ethylenedioxythiophene) polystyrene sulfonate (PEDOT:PSS)
as an ion-to-electron transducer in the ISEs and carbon nanotubes in
the PVB reference membrane²⁵, robust potentiometric sensors (with
voltage output) can be obtained for long-term continuous
measurements with negligible voltage drift. A resistance-based temperature
sensor is realized by fabricating Cr/Au metal microwires. Parylene is used
as an insulating layer to ensure reliable sensor reading by preventing
electrical contact of the metal lines with skin and sweat.

Figure 1d illustrates the system-level overview of the signal
transduction, conditioning, processing, and wireless transmission
paths to facilitate multiplexed on-body measurements. The
signal-conditioning path for each sensor is implemented with analogue
circuits and in relation to the corresponding transduced signal. The
circuits are configured to ensure that the final analogue output of each
path is finely resolved while staying within the input voltage range of
the analogue-to-digital converter. Furthermore, the microcontroller’s
computational and serial communication capabilities are used to
calibrate, compensate, and relay the conditioned signals to an on-board
wireless transceiver. The transceiver facilitates wireless data
transmission to a Bluetooth-enabled mobile handset with a custom-developed
application (Extended Data Fig. 3), containing a user-friendly interface
for sharing (through email, SMS, and so on) or uploading the data
to cloud servers. The circuit design, calibration, and power delivery
diagram of the FISA are described in Methods and Extended Data
Figs 4 and 5.

Figure 1 | Images and schematic illustrations of the FISA for
multiplexed perspiration analysis. a, Photograph of a wearable FISA
on a subject’s wrist, integrating the multiplexed sweat sensor array and
the wireless FPCB. (All photographs in this paper were taken by the
authors.) b, Photograph of a flattened FISA. The red dashed box indicates
the location of the sensor array and the white dashed boxes indicate the
locations of the integrated circuit components. c, Schematic of the sensor
array (including glucose, lactate, sodium, potassium and temperature
sensors) for multiplexed perspiration analysis. GOx and LOx, glucose
oxidase and lactate oxidase. d, System-level block diagram of the FISA
showing the signal transduction (orange) (with potential V, current I
and resistance R outputs), conditioning (green), processing (purple) and
wireless transmission (blue) paths from sensors to the custom-developed
mobile application (numbers in parentheses indicate the corresponding
labelled components in b). ADC, analogue-to-digital converter. The inset
images show the home page (left) and the real-time data display page
(right) of the mobile application.

The performance of each sensor was monitored separately with
different analyte solutions. Figure 2a and b shows the representative
current responses of the glucose and lactate sensors, measured
chronoamperometrically in 0–200 μM glucose solutions and 0–30 mM lactate
solutions, respectively. A linear relationship between current and
analyte concentrations with sensitivities of 2.35 nA μM⁻¹ for glucose
sensors and 220 nA mM⁻¹ for lactate sensors was observed. Figure 2c and d
illustrates the open circuit potentials of Na⁺ and K⁺ sensors in the
electrolyte solutions with physiologically relevant concentrations of
10–160 mM Na⁺ and 1–32 mM K⁺, respectively. Both ion-selective
sensors show near-Nernstian behaviour (according to the Nernst equation, the
theoretical sensitivity of the ISE-based sensors should be 59 mV per decade),
with sensitivities of 64.2 mV and 61.3 mV per decade of concentration
for Na⁺ and K⁺ sensors, respectively. Results of repeatability and
long-term stability studies indicate that the sensitivities of the biosensors are
consistent over a period of at least four weeks (Extended Data Fig. 6).
Figure 2e displays the linear response of the resistive temperature
sensor in the physiological skin temperature range of 20–40 °C with a
sensitivity of approximately 0.18% per degree Celsius (normalized to
the resistance at 20 °C).

The selectivity of sweat sensors is crucial, because various
electrolytes and metabolites in sweat can influence the accuracy of the sensor
readings. Extended Data Fig. 7a–d shows that the presence of
non-target electrolytes and metabolites causes negligible interference to the
response of each sensor. When all five sensors are integrated in the
FISA, simultaneous system-level measurements maintain excellent
selectivity upon varying concentrations of each analyte (Fig. 2f and
Extended Data Fig. 7e–h). Although temperature has a minimal effect
on the potentiometric sensors, it greatly influences the performance
of the enzymatic sensors. Figure 2g shows that the responses of
glucose and lactate sensors increase rapidly upon elevation of the
solution temperature from 22 °C to 40 °C, reflecting the effect of increased
enzyme activities²⁶. System integration allows for the implementation
of real-time compensation to calibrate the sensor readings on the basis
of temperature variations. Figure 2h illustrates that with the increase
of temperature, the uncompensated sensor readouts can lead to
substantial overestimation of the actual concentration of the given glucose
and lactate solutions; however, the temperature compensation allows
for accurate and consistent readings.

It is essential for wearable devices to be able to withstand the stress
of daily human wear and physical exercise. A study on mechanical
deformation conducted by monitoring the performance of both the
sensor array and the FPCB before, during and after bending (radii of
curvature are 1.5 cm and 3 cm, respectively) (Extended Data Fig. 8)
reveals minimal output changes in the FISA’s responses.

The FISAs can be comfortably worn on various body parts,
including the forehead, wrists and arms. Figure 3a shows a human subject
wearing two FISAs, packaged as a ‘smart wristband’ and a ‘smart
headband’, allowing for real-time perspiration monitoring on the
wrist and forehead simultaneously during stationary leg cycling. To
ensure the fidelity of sensor readings, the data collection of each
channel took place when a sufficient sweat sample was present, as shown
by stabilization of sensor readings within the physiologically relevant
range (see Methods). The accuracy of on-body measurements was
verified through the comparison of on-body sensor readings from
the forehead with ex situ (off-body) measurements from collected
sweat samples (Fig. 3b).

Figure 2 | Experimental characterizations of the wearable sensors.
a, b, The chronoamperometric responses of the glucose (a) and lactate
(b) sensors to the respective analyte solutions in phosphate-buffered
saline (PBS). c, d, The open circuit potential responses of the sodium (c)
and potassium (d) sensors in NaCl and KCl solutions. e, The resistance
response of the temperature sensor to temperature changes (20–40 °C) in
PBS. Insets in a–e show the corresponding calibration plots of the sensors.
Data recording was paused for 30 s for each solution change in a–e.
f, System-level interference studies of the sensor array. g, The influence of
temperature on the responses of the glucose and lactate sensors. h,
System-level real-time temperature T compensation for the glucose and lactate
sensors in 100 μM glucose and 5 mM lactate solutions, respectively.

Real-time physiological monitoring was performed on a subject
during constant-load exercise on a cycle ergometer. The protocol
involved a 3-min ramp-up, 20-min cycling at 150 W, and a 3-min
cool-down. During the exercise, the heart rate, oxygen consumption
(V̇O₂), and pulmonary minute ventilation were measured using external
monitoring instruments, and were found to increase proportionally
with increasing power output as shown in Fig. 3c. Figure 3d
illustrates the corresponding real-time measurements on the subject's
forehead using a FISA. The skin temperature remains constant at
34°C up to perspiration initiation at about 320 s. The dip in temperature
at this point indicates the beginning of perspiration and evaporative
cooling. With continued perspiration, the skin temperature
rises at about 400 s because of muscle heat conductance to skin and
then remains stable, while the concentrations of both lactate and glucose
in sweat decrease gradually. The decreases in concentration of
lactate and glucose in sweat are expected, owing to the dilution effect
caused by an increase in sweat rate, which is visually observed as
exercise continues. However, lactate concentration becomes relatively
stable after 1,100 s, indicating the stabilization of physiological
responses to continuous, sub-maximal constant-exercise power output.
Sweat [Na⁺] increases and [K⁺] decreases in the beginning of
perspiration, in line with the previous ex situ studies from the collected
sweat samples. Both [Na⁺] and [K⁺] stabilize as the cycling
continues. By wearing a FISA on different parts of the body, the
site-specific variations in electrolyte and metabolite levels can also
be monitored and studied simultaneously. Sweat analyte levels on the
wrist follow similar trends but with concentrations different from
those obtained at the forehead (Extended Data Fig. 9). In this case,
because the subject had a lower sweat rate at the wrist, the sensors
were activated at a later time.

The physiological response of the subjects to a sudden change in exercise
intensity was also investigated, in a graded-load exercise which
involved a 5-min rest, 20-min cycling at 75 W followed by cycling at
200 W power output until volitional fatigue, and a 10-min recovery
period (Fig. 3e and f). As demonstrated in Fig. 3e, the dramatic increase
in the exercise power output from 75 W to 200 W immediately leads to
abrupt elevations of heart rate, minute ventilation and V̇O₂. Responses
of the FISA during 75-W power output follow profiles similar to those
observed during the constant-load study. After the power is raised, the
sweat rate visibly increases, followed by a sharp increase in skin temperature
and sweat [Na⁺] as well as a slight increase in [K⁺] (in three of the
seven subjects, [K⁺] remained stable). The relatively stable behaviour of
[K⁺] is explained by its passive ion partitioning mechanism. With the
cessation of exercise, these physiological responses decrease and then
remain stable. No apparent difference is observed for glucose concentration
at different power output settings, a finding consistent with the
response of blood glucose to graded, short-term exercise. The change
in lactate concentration, on the other hand, varies between subjects. This
observation can be attributed to the increase in both lactate excretion
rate and sweat rate upon the increase of the workload.

Figure 3 | On-body real-time perspiration analysis during stationary
cycling. a, Photographs of a subject wearing a ‘smart headband’ and a
‘smart wristband’ during stationary cycling. b, Comparison of ex situ
calibration data of the sodium and glucose sensors from the collected
sweat samples with the on-body readings of the FISA during the stationary
cycling exercise detailed in f. c, d, Constant-load exercise at 150 W: power
output, heart rate (in beats per minute, b.p.m.), oxygen consumption (V̇O₂)
and pulmonary minute ventilation, as measured by external monitoring
systems (c) and the real-time sweat analysis results of the FISA worn on a
subject’s forehead (d). e, f, Graded-load exercise, involving a dramatic
power increase from 75 W to 200 W: power output, heart rate, V̇O₂ and
pulmonary minute ventilation, as measured by external monitoring
systems (e) and the real-time analysis results using the FISA worn on a
different subject’s forehead (f). Data collection for each sensor took place
when a sufficient sweat sample was present (see Methods).
using the FISAs. a, Schematic illustration showing the group outdoor
running trial based on wearable FISAs (packaged as ‘smart headbands’).
The data are transmitted to the user’s cell phone and uploaded to cloud
servers. b, c, Representative real-time sweat sodium (b) and potassium (c)
levels during an endurance run with water intake. d, e, Representative
real-time sweat sodium (d) and potassium (e) levels during an endurance run
without water intake.

Monitoring hydration status is of the utmost importance to athletes
because fluid deficit impairs endurance performance and increases
carbohydrate reliance. To evaluate the utility of a FISA for effective and
non-invasive identification of dehydration, real-time sweat [Na⁺] and
[K⁺] measurements were conducted simultaneously on a group of subjects
engaged in prolonged outdoor running trials (Fig. 4a). Figure 4b
and c shows that sweat [Na⁺] and [K⁺] are stable throughout running
in euhydration trials (with water intake of 150 ml per 5 min) after the
initial [Na⁺] increase and [K⁺] decrease. On the other hand, a substantial
increase in sweat [Na⁺] and a smaller increase in sweat [K⁺]
(no clear increase in [K⁺] was observed in two out of six subjects)
were observed in dehydration trials (without water intake) after 80 min
when subjects had lost a large amount of water (~2.5% of body weight)
(Fig. 4d and e). Ex situ measurements of [Na⁺] and [K⁺] from collected
sweat samples in Extended Data Fig. 10 also show similar phenomena.
These trends are probably caused by increased blood serum [Na⁺] and
[K⁺] with dehydration and increased neural stimulation, a conclusion
in agreement with previous ex situ sweat analyses. Thus, sweat [Na⁺]
can potentially serve as an important biomarker for monitoring dehydration.
We believe that this wearable platform may enable new fundamental
physiology studies through further on-body evaluation.

Thus, we have merged skin-conforming plastic-based sensors (five
different sensors) and conventional commercially available integrated-
circuit components (more than ten chips) at an unprecedented level of
integration, not only to measure the output of an array of multiplexed
and selective sensors, but also to obtain an accurate assessment via
signal processing of the physiological state of the human subjects. This
application could not have been realized by either of the technologies
(flexible sensors and silicon integrated circuits) alone, owing to their
respective inherent limitations. The plastic-based device technologies 
lack the ability to implement sophisticated electronic functionalities 
for critical signal conditioning and processing. On the other hand, 
the silicon integrated-circuit technology does not provide sufficiently 
large active areas nor the intimate skin contact required to achieve 
stable and sensitive on-body measurements. Importantly, the entire 
system is mechanically flexible, thus delivering a practical wearable 
sensor technology that can be used for prolonged indoor and outdoor 
physical activities. This platform could be exploited or reconfigured 
for in situ analyses of other biomarkers within sweat and other human 
fluid samples to facilitate personalized and real-time physiological and 
clinical investigations. We envision that the large data sets that could 
be collected through such studies, along with voluntary community 
participation, would enable data-mining techniques with which to gen- 
erate predictive algorithms for understanding the health status and 
clinical needs of individuals and society. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Materials. Selectophore grade sodium ionophore X, bis(2-ethylhexyl) sebacate
(DOS), sodium tetrakis[3,5-bis(trifluoromethyl)phenyl] borate (Na-TFPB),
high-molecular-weight polyvinyl chloride (PVC), tetrahydrofuran, valinomycin
(potassium ionophore), sodium tetraphenylborate (NaTPB), cyclohexanone,
polyvinyl butyral resin BUTVAR B-98 (PVB), sodium chloride (NaCl),
3,4-ethylenedioxythiophene (EDOT), poly(sodium 4-styrenesulfonate) (NaPSS),
glucose oxidase (from Aspergillus niger), chitosan, single-walled carbon nanotubes,
iron(III) chloride, potassium ferricyanide (III), multiwall carbon nanotubes
and block polymer PEO-PPO-PEO (F127) were obtained from Sigma Aldrich.
L-lactate oxidase (>80 activity units per milligram) was procured from Toyobo
Corp. and PBS (pH 7.2) was obtained from Life Technologies. Moisture-resistant
100-μm-thick PET was purchased from McMaster-Carr.

Fabrication of electrode arrays. The fabrication process of the electrode arrays is
detailed in Extended Data Fig. 1. Briefly, the sensor arrays on PET were patterned
by photolithography using positive photoresist (Shipley Microposit S1818) followed
by 30 nm Cr/50 nm Au deposited via electron-beam evaporation and lift-off
in acetone. A 500-nm parylene C insulation layer was then deposited in a SCS
Labcoter 2 Parylene Deposition System. Subsequently, photolithography was used
to define the final electrode area (3 mm diameter) followed by O₂ plasma etching
for 450 s at 300 W to remove the parylene completely. Electron-beam evaporation
was then performed to pattern 180-nm Ag onto the electrode areas, followed by
lift-off in acetone. The Ag patterns on working electrode area were dissolved in a
6-M HNO₃ solution for 1 min. The Ag/AgCl reference electrodes were obtained
by injecting 10 μl 0.1-M FeCl₃ solution on top of each Ag reference electrode using
a micropipette for 1 min.

Design of electrochemical sensors. For amperometric glucose and lactate sen- 
sors, a two-electrode system where Ag/AgCl acts as both reference and counter 
electrode was chosen to simplify circuit design and to facilitate system integration. 
The two-electrode system is a common strategy for low-current electrochemical 
sensing. The output currents (between the working electrode and the Ag/AgCl
reference/counter electrode) of the glucose and lactate sensors could be converted 
to a voltage potential through a transimpedance amplifier. It is known that amper- 
ometric sensors with larger area provide larger current signal. Considering the low 
concentration of glucose in sweat, we designed the sensors to be 3 mm in diameter 
to obtain a high current. 

Preparation of Na⁺ and K⁺ selective sensors. The Na⁺ selective membrane cocktail
consisted of Na ionophore X (1% weight by weight, w/w), Na-TFPB (0.55%
w/w), PVC (33% w/w), and DOS (65.45% w/w). 100 mg of the membrane cocktail
was dissolved in 660 μl of tetrahydrofuran. The K⁺-selective membrane cocktail
was composed of valinomycin (2% w/w), NaTPB (0.5% w/w), PVC (32.7% w/w), and
DOS (64.7% w/w). 100 mg of the membrane cocktail was dissolved in 350 μl of
cyclohexanone. The ion-selective solutions were sealed and stored at 4°C. The
solution for the PVB reference electrode was prepared by dissolving 79.1 mg PVB
and 50 mg of NaCl into 1 ml methanol. 2 mg F127 and 0.2 mg of multiwall carbon
nanotubes were added into the reference solution to minimize the potential drift.

Poly(3,4-ethylenedioxythiophene) PEDOT:PSS was chosen as the ion-electron
transducer to minimize the potential drift of the ISEs and deposited onto
the working electrodes by galvanostatic electrochemical polymerization with an
external Ag/AgCl reference electrode from a solution containing 0.01-M EDOT
and 0.1-M NaPSS. A constant current of 14 μA (2 mA cm⁻²) was applied to produce
polymerization charges of 10 mC onto each electrode.
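
For a galvanostatic (constant-current) deposition, the charge passed is simply current multiplied by time, so the stated charge and current imply a deposition time. The short sketch below works this out; the function name is ours, and the 14 μA and 10 mC inputs are the values as read from the text above.

```python
def deposition_time_s(charge_coulombs, current_amps):
    """Time needed to pass a given charge at constant current (t = Q / I)."""
    if current_amps <= 0:
        raise ValueError("current must be positive")
    return charge_coulombs / current_amps


# 10 mC at 14 uA, as quoted in the text.
time_s = deposition_time_s(10e-3, 14e-6)
print(f"{time_s:.0f} s (about {time_s / 60:.0f} min) per electrode")
```

Under these inputs the polymerization takes roughly 12 min per electrode.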

Ion-selective membranes were then prepared by drop-casting 10 μl of the Na⁺-selective
membrane cocktail and 4 μl of the K⁺-selective membrane cocktail onto
their corresponding electrodes. The common reference electrode for the Na⁺ and
K⁺ ISEs was modified by casting 10 μl of reference solution onto the Ag/AgCl
electrode. The modified electrodes were left to dry overnight. The sensors could
be used without pre-conditioning (with a small drift of ~2-3 mV h⁻¹). However, to
obtain the best performance for long-term continuous measurements such as dehy- 
dration studies, the ion-selective sensors were covered with a solution containing 
0.1-M NaCl and 0.01-M KCl through microinjection (without contact to glucose 
and lactate sensors) for 1h before measurements. This conditioning process was 
important to minimize the potential drift further. 

Preparation of lactate and glucose sensors. 1% chitosan solution was first prepared
by dissolving chitosan in 2% acetic acid and magnetic stirring for about 1 h; next, the
chitosan solution was mixed with single-walled carbon nanotubes (2 mg ml⁻¹) by
ultrasonic agitation over 30 min to prepare a viscous solution of chitosan and carbon
nanotubes. To prepare the glucose sensors, the chitosan/carbon nanotube solution
was mixed thoroughly with glucose oxidase solution (10 mg ml⁻¹ in PBS of pH 7.2)
in the ratio 2:1 (volume by volume). A Prussian blue mediator layer was deposited
onto the Au electrodes by cyclic voltammetry from 0 V to 0.5 V (versus Ag/AgCl)
for one cycle at a scan rate of 20 mV s⁻¹ in a fresh solution containing 2.5 mM
FeCl₃, 100 mM KCl, 2.5 mM K₃Fe(CN)₆, and 100 mM HCl. A thinner Prussian
blue layer can provide better sensitivity, which is essential for low-glucose-level
measurements in sweat. The glucose sensor was obtained by drop-casting 3 μl of the
glucose oxidase/chitosan/carbon nanotube solution onto the Prussian blue/Au
electrode. For the lactate sensors, the Prussian blue mediator layer was deposited onto
the Au electrodes by cyclic voltammetry from −0.5 V to 0.6 V (versus Ag/AgCl) for
10 cycles at 50 mV s⁻¹ in a fresh solution containing 2.5 mM FeCl₃, 100 mM KCl,
2.5 mM K₃Fe(CN)₆, and 100 mM HCl. A thicker Prussian blue layer can provide a
wider linear response range, which is crucial for lactate measurement in sweat. 3 μl
of the chitosan/carbon nanotube solution was drop-cast onto the Prussian blue/Au
electrode and dried in the ambient environment; the electrode was later covered
with 2 μl of lactate oxidase solution (40 mg ml⁻¹) and finally 3 μl of the chitosan/
carbon nanotube solution. The sensor arrays were allowed to dry overnight at 4°C
with no light. The solutions were stored at 4°C when not in use.

Signal conditioning, processing and wireless transmission circuit design. The 
circuit diagram of the analogue signal-conditioning block of the FISA is shown in 
Extended Data Fig. 4. At the core of our system we used an ATmega328P (Atmel 
8-bit) microcontroller that could be programmed on-board through an in-circuit 
serial programming interface. This microcontroller is compatible with the popular 
Arduino development environment, and is commonly used in autonomous systems 
with low power and low cost requirements. By exploiting the microcontroller’s 
built-in 10-bit analogue-to-digital converter block as well as its computational and 
serial communication capability, we relayed the signals (as transduced by our sensor 
module and as conditioned by our analogue circuitry) to the Bluetooth transceiver. 

The conditioning path for each sensor was implemented in relation to the 
corresponding sensing mode. In the case of the amperometric-based glucose 
and lactate sensors, the originally generated signal was in the form of electrical 
current. Therefore, in the respective signal conditioning paths, we first used a 
transimpedance amplifier stage to convert the signal current into voltage. In our 
electrical current measurements, the direction of the current was from the shared 
Ag/AgCl reference/counter electrode towards the working electrode of each of 
the glucose and lactate sensors, which would result in a negative transimpedance 
output voltage. Hence, for both glucose and lactate paths, the transimpedance 
amplifiers were followed by inverter stages to make the respective voltage signals 
positive, since the analogue-to-digital converter stage took only positive input 
values. The feedback resistors in each of the transimpedance sections were chosen
(1 MΩ for the glucose path and 0.5 MΩ for the lactate path) such that the
converted voltage signal could be finely resolved, while staying within the input 
voltage range of the analogue-to-digital converter stage of the microcontroller. The 
current sensing signal paths were capable of measuring current levels as low as 
1 nA, which was much lower than the minimum signal in our measurements (tens
of nanoamperes). In this implementation, with the transimpedance amplifier at the 
front-end, the Ag/AgCl reference/counter electrode of the amperometric-based 
sensors needed to be grounded. This requirement prevented us from grounding the 
shared PVB reference electrodes in the potentiometric-based sensors, because the 
potential difference between the Ag/AgCl reference and PVB electrodes changes 
in the presence of different chloride ion concentrations (Extended Data Fig. 2b). 
In the case of the ISE-based sensors, the generated signals were essentially the 
voltage differences between the PVB-coated shared reference electrode and 
the working electrode of the respective sensors. Therefore, without grounding the 
PVB electrode, we measured the difference in potential of the floating ISE working 
and shared electrodes directly. To this end, the signal conditioning paths of the 
potentiometric-based sensors included a voltage buffer interfacing the respective 
working and reference electrodes, followed by a differential amplifier to effec- 
tively implement an instrumentation amplifier configuration. With this approach 
we ensured that the voltage-sensing and current-sensing paths were electrically 
isolated. Furthermore, the differential sensing stage also helped to minimize the 
unwanted common-mode interferences which would have otherwise degraded the 
fidelity of our sensor readings. Also, the high impedance nature of the ISE-based 
sensors required the use of high-impedance voltage buffers to ensure accurate
open voltage measurement as intended. 
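
The amperometric path described above can be summarized numerically: the transimpedance stage turns a current I into a voltage of magnitude I × R_f, and the microcontroller digitizes the result with its 10-bit converter. The sketch below is a minimal illustration under stated assumptions. The feedback resistances come from the text, whereas the 5-V converter reference, the `extra_gain` parameter and the function names are hypothetical and introduced only for this example.

```python
# Feedback resistances stated in the text (ohms).
FEEDBACK_OHMS = {"glucose": 1.0e6, "lactate": 0.5e6}


def adc_counts_to_volts(counts, v_ref=5.0, bits=10):
    """Convert a raw ADC code to volts; v_ref = 5.0 V is an assumption."""
    levels = 1 << bits  # 1024 levels for a 10-bit converter
    if not 0 <= counts < levels:
        raise ValueError("ADC code out of range")
    return counts * v_ref / levels


def sensor_current_amps(v_out_volts, channel, extra_gain=1.0):
    """Recover the sensor current from the conditioned (inverted, positive) voltage.

    extra_gain is a hypothetical placeholder for any additional gain in the
    signal path; 1.0 means the transimpedance stage alone sets the scale.
    """
    if v_out_volts < 0:
        raise ValueError("the inverter stage is expected to make the signal positive")
    return v_out_volts / (FEEDBACK_OHMS[channel] * extra_gain)
```

For example, a conditioned voltage of 50 mV on the glucose path corresponds to 50 nA under these assumptions, and to 100 nA on the lactate path because of its smaller feedback resistor.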

All the analogue signal conditioning paths concluded with a corresponding
unity gain four-pole low pass filter, each with a −3-dB frequency at 1 Hz to
minimize the noise and interference in our measurements. The choice of using 
active filters in our system also gave us flexibility in tuning the gain in our signal- 
conditioning path if needed. The low pass filters were connected to the ana- 
logue-to-digital converter stage of the microcontroller, to facilitate the conversion 
of the filtered analogue signals to their respective digital forms. In our implementation,
each of the analogue signal conditioning paths was electrically characterized
to validate the linear output response of the channels with respect to
the corresponding electrical input signals mimicking the sensor output signals. 
For this characterization step, electrical current was applied as an input to the 
glucose and lactate channel terminals to model the respective amperometric-based 
sensor output and differential voltage was applied at the terminals of the sodium 
and potassium channels to model the corresponding potentiometric-based sensor
output. As illustrated in Extended Data Fig. 5a-d, all four signal-conditioning
channels demonstrated an excellent linear response (correlation factor R = 1). To
eliminate the non-ideal effects such as voltage offset and to obtain precise signal 
readings, the exact numerical linear relationship between output and input was 
obtained to map the original input signal to the analogue circuit readouts, which 
in turn allowed for subsequent signal calibration and processing at the software 
level. Upon processing and averaging the data, the microcontroller was exploited 
to relay the data to the Bluetooth module for wireless transmission. 
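
The channel characterization described above amounts to fitting a straight line to (input, readout) pairs and then inverting it in software. The following sketch shows one way to do that with an ordinary least-squares fit. The function names and the sample numbers in the usage note are illustrative assumptions, not the authors' firmware or data.

```python
def fit_line(inputs, readouts):
    """Least-squares fit of readout = slope * input + offset."""
    n = len(inputs)
    if n < 2 or n != len(readouts):
        raise ValueError("need at least two matching (input, readout) pairs")
    mean_x = sum(inputs) / n
    mean_y = sum(readouts) / n
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    if sxx == 0:
        raise ValueError("inputs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, readouts))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


def readout_to_input(readout, slope, offset):
    """Invert the fitted line to recover the original input signal."""
    return (readout - offset) / slope
```

As a usage example with made-up numbers, applying inputs of 0, 10, 20 and 30 (arbitrary units) and reading 0.02, 1.02, 2.02 and 3.02 gives a slope of 0.1 and an offset of 0.02, so a later readout of 1.52 maps back to an input of 15. The offset term is what absorbs non-ideal effects such as a constant voltage offset.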

Power delivery to the FISA. The FISA was powered by a single rechargeable 
lithium-ion polymer battery with a nominal voltage of 3.7 V of a desired capacity 
(a representative 105-mAh battery is illustrated in Extended Data Fig. 5e and f). The
included protection circuitry protects the battery against unwanted output shorts and
over-charging. Step-up direct current/direct current converters were used to produce 
a fixed, regulated output of +5 V for the microcontroller and +3.3 V for the Bluetooth 
modules. This regulated output also served as the positive power supply for the ana- 
logue peripheral components. The negative power supply (—5 V) for the analogue 
peripheral components was implemented through the use of inverting charge pump 
direct current/direct current converters that produce negative regulated outputs.

The custom mobile application design. A mobile application (the Perspiration
Analysis App) was designed to accompany the FISA and to provide a user-friendly 
interface for data display and aggregation (Extended Data Fig. 3). To use this appli- 
cation, first, the user should wear the FISA and open the Perspiration Analysis App 
on the mobile device. The application establishes a secure Bluetooth connection to 
the FISA. Subsequently, it receives and displays the stream of data that are trans- 
mitted in real time from the FISA. The application is capable of plotting a graph of 
these values versus time during the user’s physical activities. The data and graphs 
can be stored on the device, uploaded to cloud servers online, and can be shared via 
social media. Additionally, the application keeps track of the duration of exercise 
as well as the distance travelled. Although the current implementation was pro- 
grammed in the Android environment, similar application interfaces could easily 
be developed in other popular mobile operating systems such as iOS. 

The characterization of the sensors. A set of electrochemical sensors was charac- 
terized to explore their reproducibility in solutions of target analytes. Extended Data 
Fig. 6a-d shows that Na⁺ and K⁺ sensors had a relative standard deviation of ~1% in
sensitivity while glucose and lactate sensors had a relative standard deviation of ~5% 
in sensitivity. However, there are differences in absolute potential values for ISEs in 
the same solution. Therefore, one-point calibration in a standard solution containing 
1 mM KCl and 10 mM NaCl was performed for Na⁺ and K⁺ sensors before each use.
The measured potential of ISEs in the standard solution was then set to zero by the 
microcontroller. (Such calibration is similar to what is done in commercial finger-stick 
glucose sensors.) No calibration was needed for the glucose and lactate sensors. Long- 
term stability of the sensors was also evaluated over a period of four weeks using five 
different sensor arrays each week (Extended Data Fig. 6e-h). It was observed that the 
Na⁺ and K⁺ sensors had approximately the same sensitivities of 62.5 mV and 59.5 mV
per decade of concentration, respectively, in ambient conditions. The sensitivities of 
the glucose and lactate sensors were similarly maintained within 5% of their original 
values over the four-week period when stored at 4°C. The glucose and lactate sen- 
sors were characterized chronoamperometrically using a Gamry Electrochemical 
Potentiostat (Fig. 2a and b). Owing to Faraday and capacitive currents, the responses
of both sensors showed drift initially but stabilized within 1 min of the data recording. 
The in vitro temperature compensation experiments (Fig. 2h) were performed con- 
tinuously using the same sensor in four Petri dishes containing solutions at different 
temperatures on different hot plates. The convection and non-uniform distributions 
of solution temperature could result in noticeable noise in the signal measurements. 
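
The one-point calibration described above can be turned into a concentration estimate if the electrode response is taken to be log-linear, with the potential in the standard solution defined as zero. The sketch below assumes exactly that. The sensitivities and the standard-solution concentrations are taken from the text, while the function name and the assumption of an ideal log-linear response over the whole range are ours.

```python
# Sensitivities (mV per decade of concentration) and standard-solution
# concentrations (mM) as quoted in the text.
SENSITIVITY_MV_PER_DECADE = {"Na": 62.5, "K": 59.5}
STANDARD_MM = {"Na": 10.0, "K": 1.0}


def ion_concentration_mm(ion, potential_mv, potential_in_standard_mv=0.0):
    """Estimate ion concentration from an ISE potential after one-point calibration.

    Assumes an ideal log-linear response:
        E - E_standard = S * log10(C / C_standard)
    """
    delta_mv = potential_mv - potential_in_standard_mv
    decades = delta_mv / SENSITIVITY_MV_PER_DECADE[ion]
    return STANDARD_MM[ion] * 10 ** decades
```

Under this model a Na⁺ electrode reading 62.5 mV above its value in the standard solution corresponds to a tenfold higher concentration, that is, about 100 mM.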

For continuous use, all the sensors displayed excellent stability over the entire 
exercise period. The sensor array could be repeatedly used for continuous temper- 
ature and sweat electrolyte monitoring. However, the glucose and lactate responses 
degraded beyond the exercise period (after two hours) owing to decreased enzyme 
activity. The sensor-FPCB interface allows for convenient replacement of the fresh 
sensor arrays for subsequent use. 

Analysis of the effect of mechanical deformation on the sensors was performed 
by repeatedly bending the Na⁺, glucose and temperature sensors (radius
of curvature, 1.5 cm) as well as the FPCB (radius of curvature, 3 cm) for a total of 
60 cycles (Extended Data Fig. 8). Performance of the sensors was recorded after 
every 30 cycles. Continuous measurement on sensor performance during bending 
and no bending was also performed. 

Ex situ evaluation of the sweat samples. Ex situ evaluation of sensor performance
was also conducted by testing sweat samples collected from the subjects’ foreheads.
Sweat samples were collected every 2-4 min by scratching their foreheads with
microtubes, and subjects’ foreheads were wiped and cleaned with gauze after every
sweat collection. The changes of [Na⁺] and [K⁺] during euhydration and dehydration
trials were also studied ex situ in the same manner. The calibration of the
sensor arrays was performed before ex situ measurements using artificial sweat
containing 22 mM urea, 5.5 mM lactic acid, 3 mM NH₄⁺, 0.4 mM Ca²⁺, 50 μM
Mg²⁺ and 25 μM uric acid with varying glucose concentrations of 0-200 μM, [K⁺]
of 1-16 mM and [Na⁺] of 10-160 mM.

The setup of FISA for on-body testing. A water-absorbent thin rayon pad was 
placed between the skin and the sensor array during on-body experiments to 
absorb and maintain sufficient sweat for stable and reliable sensor readings, and 
to prevent direct mechanical contact between the sensors and skin. The pad could 
absorb about 10 μl of sweat, which was sufficient to provide stable sensor readings.
During on-body tests, the newly generated sweat would refill the pad and
‘rinse away’ the old sweat. The on-body measurement results were also consistent 
with ex situ tests using freshly collected sweat samples. Assuming a single-centred
flow model, the best-case sampling interval can be calculated to be less
than 1 min, based on the sweat rate (~3-4 mg min⁻¹ cm⁻²) and the pad size
(1.5 cm × 2 cm × 50 μm). The intrinsic response time of FISA was smaller than
the body’s response time to the changes in physiological conditions. An increase 
in temperature was observed when the ‘smart headband’ or ‘smart wristband’ was 
worn owing to the use of the plastic substrate on skin. Although this may result in 
a small error in measuring the actual skin temperature, it should be noted that this 
does not have an impact on the measurement of the electrolytes and metabolites, 
owing to the on-board temperature calibration. To ensure the fidelity of sensor 
readings further, the data collection of each channel took place when a sufficient 
sweat sample was present, as shown by the stabilization of the sensor readings 
(varying within 10% of the readings of the continuous five data points) within 
the physiologically relevant range: [Na⁺], 20-120 mM; [K⁺], 2-16 mM; glucose
concentration, 0-200 μM; and lactate concentration, 2-30 mM.
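
The acceptance rule above (five consecutive readings that vary within 10% and lie in the physiologically relevant range) can be expressed as a small check. The sketch below is one possible reading of that criterion, taking "within 10%" to mean that the spread of the last five points is at most 10% of their mean. That interpretation, the channel keys and the function name are assumptions.

```python
# Physiologically relevant ranges quoted in the text.
PHYSIOLOGICAL_RANGE = {
    "Na_mM": (20.0, 120.0),
    "K_mM": (2.0, 16.0),
    "glucose_uM": (0.0, 200.0),
    "lactate_mM": (2.0, 30.0),
}


def is_stable(readings, channel, window=5, tolerance=0.10):
    """Return True when the most recent readings are stable and in range."""
    if len(readings) < window:
        return False
    recent = readings[-window:]
    low, high = PHYSIOLOGICAL_RANGE[channel]
    if any(r < low or r > high for r in recent):
        return False
    mean = sum(recent) / window
    if mean == 0:
        return all(r == 0 for r in recent)
    return (max(recent) - min(recent)) <= tolerance * abs(mean)
```

A channel would then start contributing data only once `is_stable` first returns True, which corresponds to enough sweat having wetted the pad.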

On-body sweat analysis. The on-body evaluation of the FISA was performed in 
compliance with the protocol that was approved by the institutional review board 
at the University of California, Berkeley (CPHS 2014-08-6636). 26 healthy subjects 
(4 females and 22 males), aged 20-40, were recruited from the University of 
California, Berkeley campus and the neighbouring community through advertise- 
ment by posted notices, word of mouth, and email distribution. All subjects gave 
written, informed consent before participation in the study. The study was con- 
ducted as three trials: constant workload cycle ergometry, graded workload cycle 
ergometry, and outdoor running. Constant workload cycle ergometry was con- 
ducted on 14 volunteers (4 females and 10 males between the ages of 20 and 40). 
The graded cycle ergometry was conducted on 7 male volunteers (who were also 
involved in the constant workload cycle study). 12 male volunteers between the 
ages of 20 and 40 were recruited for the outdoor running study. An electronically 
braked leg-cycle ergometer (Monark Ergomedic 839E, Monark Exercise AB) was 
used for cycling trials, which included real-time monitoring of heart rate, oxygen 
consumption (V̇O₂), and pulmonary minute ventilation. The power output was
calibrated and monitored through the ergometer. Heart rate was measured using
a Tickr heart rate monitor (Wahoo fitness), and V̇O₂ and minute ventilation were
continuously recorded throughout trials via an open-circuit, automated, indirect 
calorimetry system (TrueOne metabolic system; ParvoMedics). The FISAs were 
packaged inside traditional sweatbands during the indoor and outdoor trials. The 
sensor arrays were calibrated, and the subjects’ foreheads and wrists were cleaned 
with alcohol swabs and gauze before sensors were worn on-body. For the constant
workload cycling trial, subjects cycled at 50 W with 50-W increments every
90 s up to 150 W, followed by 20 min of cycling at 150 W. The power output was then
decreased by 50 W every 90 s. The graded workload trial consisted of 5 min of
seated rest followed by cycling at 75 W for 20 min and then cycling at 200 W until 
fatigue followed by a 10-min rest. The outdoor running trial was conducted with 
a group of 12 subjects in which 6 were instructed to drink 150 ml water every 5 min 
and 6 did not drink water throughout the trial. Subjects consented to run until 
volitional fatigue at a self-selected pace (8-12 km h⁻¹) and the Na⁺ and K⁺ sensor
responses (from their foreheads) were recorded. 
Extended Data Figure 1 | Fabrication process of the flexible sensor
array. a, PET cleaning using acetone, isopropanol and O₂ plasma etching.
b, Patterning of Cr/Au electrodes using photolithography, electron-beam
evaporation and lift-off in acetone. c, Parylene insulating layer
deposition. d, Photolithography and O₂ plasma etching of parylene in
the electrode areas. e, Electron-beam deposition of the Ag layer followed
by lift-off in acetone. f, Ag etching on the Au working electrode area and
Ag chloridation on the reference electrode area. g, Optical image of the
flexible electrode array. h, Photograph of the multiplexed sensor array after
surface modification.
Extended Data Figure 2 | The characterizations of the modified
electrodes. a, Cyclic voltammetry of the amperometric glucose and lactate
sensors using Prussian blue as a mediator in PBS (pH 7.2). Scan range,
−0.2 V to 0.5 V; scan rate, 50 mV s⁻¹. b, Potential stability of a PVB-coated
Ag/AgCl electrode and a solid-state Ag/AgCl reference electrode (versus
commercial aqueous Ag/AgCl electrode) in different NaCl solutions.
c, d, The stability of a PVB-coated reference electrode in solutions
containing 50 mM NaCl and 10 mM of different anionic (c) and cationic
(d) solutions. Data recording was paused for 30 s for each solution
change in b-d.
Extended Data Figure 3 | The custom-developed mobile application for data display and aggregation. a, The home page of the application after
Bluetooth pairing. b, Real-time data display of sweat analyte levels as well as skin temperature during exercise. c, Real-time data progression of individual 
sensor. d, Available data sharing and uploading options. 
Extended Data Figure 5 | The calibration and power delivery of the
FISA. a-d, Flexible PCB calibration for glucose (a), lactate (b), sodium
(c) and potassium (d) channels. e, Power delivery diagram of the
system. f, Photograph of a small rechargeable battery module used in
the current work (placed next to a quarter-dollar coin for comparison).
g, Representative photograph of the power delivery package inside a
transparent wristband on a subject’s wrist.
Extended Data Figure 6 | Reproducibility and long-term stability of the
biosensors. a-d, The reproducibility of the sodium (a), potassium
Extended Data Figure 7 | Selectivity study for electrochemical
biosensors. a-d, The interference study for individual glucose (a), lactate
(b), sodium (c) and potassium (d) sensors using an electrochemical
working station. Data recording was paused for 30 s for the addition of
each analyte in c and d. e, f, The real-time system-level interference study
(e) and calibration plot (f) of the amperometric glucose and lactate sensor
array with a shared solid-state Ag/AgCl reference electrode. g, h, The real-time
interference study (g) and calibration plot (h) of the potentiometric
Na⁺ and K⁺ sensor array with a shared PVB-coated reference electrode.
Data recording was paused for 30 s for each solution change in e and g.
Extended Data Figure 8 | Mechanical deformation study of the flexible
sensors and the FPCB. a-f, The responses of the sodium (a), potassium
(b), glucose (c), lactate (d), temperature (e) sensors and of the FPCB (f)
after 0, 30 and 60 cycles of bending. g-l, The responses of the sodium (g),
potassium (h), glucose (i), lactate (j), and temperature (k) sensors and
of the FPCB (l) during bending. The radii of curvature for the bending
study of sensors and the FPCB were 1.5 cm and 3 cm, respectively. Data
recording was paused for 30 s to change the conditions and settings.
Extended Data Figure 9 | On-body real-time perspiration analysis
during stationary cycling using the FISA on a subject’s wrist. Conditions 
are as in Fig. 3c and d. 
Extended Data Figure 10 | Ex situ measurement of collected sweat
samples using the FISA on a subject during stationary cycling at 150 W. 
Lithium-ion battery structure that self-heats at low temperatures

Chao-Yang Wang, Guangsheng Zhang, Shanhai Ge, Terrence Xu, Yan Ji, Xiao-Guang Yang & Yongjun Leng


Lithium-ion batteries suffer severe power loss at temperatures 
below zero degrees Celsius, limiting their use in applications such 
as electric cars in cold climates and high-altitude drones¹,². The
practical consequences of such power loss are the need for larger,
more expensive battery packs to perform engine cold cranking,
slow charging in cold weather, restricted regenerative braking,
and reduction of vehicle cruise range by as much as 40 per cent³.
Previous attempts to improve the low-temperature performance
of lithium-ion batteries⁴ have focused on developing additives
to improve the low-temperature behaviour of electrolytes⁵,⁶, and
on externally heating and insulating the cells⁷⁻⁹. Here we report
a lithium-ion battery structure, the ‘all-climate battery’ cell, that 
heats itself up from below zero degrees Celsius without requiring 
external heating devices or electrolyte additives. The self-heating 
mechanism creates an electrochemical interface that is favourable 
for high discharge/charge power. We show that the internal 
warm-up of such a cell to zero degrees Celsius occurs within 
20 seconds at minus 20 degrees Celsius and within 30 seconds at 
minus 30 degrees Celsius, consuming only 3.8 per cent and 5.5 
per cent of cell capacity, respectively. The self-heated all-climate 
battery cell yields a discharge/regeneration power of 1,061/1,425 
watts per kilogram at a 50 per cent state of charge and at minus 
30 degrees Celsius, delivering 6.4-12.3 times the power of state- 
of-the-art lithium-ion cells. We expect the all-climate battery to 
enable engine stop-start technology capable of saving 5-10 per 
cent of the fuel for 80 million new vehicles manufactured every 
year¹⁰. Given that only a small fraction of the battery energy is used
for self-heating, we envisage that the all-climate battery cell may 
also prove useful for plug-in electric vehicles, robotics and space 
exploration applications. 

Figure 1a schematically shows a generic lithium (Li)-ion all-climate 
battery (ACB) cell. In addition to the three essential battery components
(anode, cathode and electrolyte), we add here a fourth component:
a nickel (Ni) foil 50 μm in thickness having two tabs, one at each
end. Electrical resistance between the two tabs is designed to be 56 mΩ
at room temperature (25°C) to keep the cell voltage around 2 V and 
to avoid solid-electrolyte interphase decomposition and copper foil 
oxidation. One tab is electrically connected to the negative terminal, 
welded together with the tabs of all anode layers. The second tab of the 
Ni foil extends outside the cell to form a third terminal, the activation 
terminal, used to activate battery internal heating at low temperatures. 
A switch connects the activation terminal with the negative terminal. 
When the switch is left open during cell activation for self-heating, 
electrons must flow through the Ni foil, generating substantial ohmic 
heat, which rapidly warms up the core of the battery. Once the bat- 
tery internal temperature reaches or exceeds 0°C, thereby enabling 
the electrochemical interface to generate high power for both dis- 
charge and charge, the activation process is completed and the switch 
is closed. When the ACB cell operates at around room temperature, the 
switch between the activation terminal and negative terminal remains 


closed, making electrons bypass the Ni foil and reverting the ACB cell 
to a conventional Li-ion cell with very low internal resistance and high 
power. The switch between activation terminal and negative terminal 
may be controlled by the cell surface temperature. 
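
The switching behaviour described above can be sketched as a small control function. This is only an illustrative sketch: the function name, the enum and the use of a single 0 °C threshold on the measured surface temperature are assumptions made here for illustration, not the authors' controller.

```python
from enum import Enum


class SwitchState(Enum):
    OPEN = "open"      # current is forced through the Ni foil (self-heating)
    CLOSED = "closed"  # Ni foil is bypassed (conventional Li-ion operation)


def select_switch_state(surface_temp_c: float, threshold_c: float = 0.0) -> SwitchState:
    """Return the switch position for a given cell surface temperature.

    Hypothetical rule: keep the switch open (activation) while the cell is
    below the threshold, and close it once the threshold is reached or exceeded.
    """
    if surface_temp_c < threshold_c:
        return SwitchState.OPEN
    return SwitchState.CLOSED
```

With this rule a cell soaked at −30 °C starts with the switch open, and the switch closes as soon as the measured temperature reaches 0 °C, after which the cell behaves like a conventional Li-ion cell.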

Figure 1b shows cell voltage and surface temperature evolutions dur- 
ing cell activation followed by a 1C discharge of a 7.5 amp-hour (Ah) 
ACB cell at −20 °C, and similar results are shown in Extended Data
Fig. 1a and b for −30 °C and −40 °C, respectively. The cell exhibits
sufficiently high voltage to stay ‘healthy’ (that is, the battery materials
do not suffer potential degradation) throughout activation and 1C
discharge processes even from as low as −40 °C. More noteworthy
is that the cell surface temperature rises rapidly, in seconds, from an
extremely cold environment to 0 °C within the activation process (better
seen in the insets to Fig. 1b and Extended Data Fig. 1, where the
cell activation process is magnified). It is clear that cell activation takes
only 19.5 s, 29.6 s and 42.5 s from environments at −20 °C, −30 °C
and −40 °C, respectively. After activation, the cell surface temperature
drops slightly below the freezing point owing to large heat loss to the
cold surroundings in these environmental-chamber tests; however, in
reality it would remain around the freezing point owing to the thermal
insulation usually applied around cells. The 1C discharge energy,
calculated by integrating the area underneath each discharge curve,
is 102 watt-hours per kilogram (Wh kg⁻¹) for the ACB cell at −40 °C,
compared to only 0.3 Wh kg⁻¹ for the baseline cell without Ni foil. The
ACB cell thus provides much more usable energy, enabling a longer 
cruising range for an electric car, especially in extreme cold. 
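
The integration of a discharge curve mentioned above can be illustrated with a short sketch. It assumes a constant-current discharge sampled as (time, voltage) pairs and uses the trapezoidal rule; the function name and the sample curve are made up for illustration, and only the 7.5-A (1C) current and the 160-g cell mass come from the text.

```python
def specific_energy_wh_per_kg(times_s, voltages_v, current_a, mass_kg):
    """Trapezoidal integral of V(t) * I over time, converted to Wh per kg."""
    if len(times_s) != len(voltages_v) or len(times_s) < 2:
        raise ValueError("need at least two matching time/voltage samples")
    energy_j = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        mean_v = 0.5 * (voltages_v[i] + voltages_v[i - 1])
        energy_j += mean_v * current_a * dt  # joules delivered in this interval
    return energy_j / 3600.0 / mass_kg  # J -> Wh, then per kilogram


# Made-up illustrative curve (not measured data): 1C discharge of a 0.160-kg cell
times = [0.0, 900.0, 1800.0, 2700.0]
volts = [3.6, 3.4, 3.2, 2.8]
energy = specific_energy_wh_per_kg(times, volts, current_a=7.5, mass_kg=0.160)
```

A real discharge record would contain far more samples, but the calculation is the same.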

The ultrafast cell activation discovered in this work makes ACB 
cells technologically viable for boosting battery power. Fundamentally, 
the activation time may be estimated as follows. Assuming negligible 
heat loss from cell surfaces to the surroundings owing to a short time 
duration (this assumption is realistic as batteries are well insulated in 
vehicles), the energy balance during cell activation is: 


$$\int_{0}^{\tau_{\mathrm{act}}} I_{\mathrm{act}}\,(U_0 - V_{\mathrm{act}})\,\mathrm{d}t = m c_p \Delta T \qquad (1)$$

where Iact and Vact are current and output voltage during cell activation,
U0 is the thermal equilibrium potential of a Li-ion cell (~4.2 V
for the cells used in this study), τact is activation time, m is cell mass,
cp is the specific heat of the cell, and ΔT is the rise in temperature
from the initial ambient temperature to, for example, 0 °C. Assuming
cp = 1,000 J kg⁻¹ K⁻¹ and using an average activation current Iact of
47.4 A in the −30 °C activation case (see Extended Data Fig. 2), the
theoretical activation time τact is estimated to be 26.7 s, very close to the
measured 29.6 s. This also indicates that the self-heating mechanism
devised in the ACB cell structure is very energy-efficient (~90% in this
case). If Vact = 0 V activation is implemented, one can convert 10% more
electric energy into internal heat for battery warm-up from very low
temperatures, thereby further shortening activation time. This is the
greatest advantage of ACB cells over existing battery heating methods,
which are much more energy- and time-consuming⁷⁻⁹. For example,
Figure 1 | The ACB. a, Schematic in which a metal foil is inserted to
generate internal heating from a low temperature and to provide fast 
heat transfer to electrodes and electrolyte. This self-heating function is 
activated by turning off the switch between the activation terminal and 
the negative terminal. b, Cell voltage and temperature evolutions during 


Vlahinos and Pesaran⁷ computationally showed that battery core
heating based on the cell’s internal resistance is more effective than
external heating methods. Stuart and Hande⁸ argued that direct-current
internal heating is ineffective and instead implemented expensive,
heavy alternating-current generators for heating. More recently,
Ji and Wang⁹ thoroughly reviewed a wide range of heating strategies
for Li-ion batteries and demonstrated that self-resistive heating from
−20 °C to 20 °C takes ~120 s and consumes ~15% battery energy. For
heating from −20 °C to 0 °C as in the present context, their cell would
require a 60-s heating time and 7.5% energy consumption, much less
efficient than the present ACB cell.
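
The activation-time estimate from equation (1) can be checked with a few lines of arithmetic. The sketch below assumes constant current and voltage over the activation, takes the 160-g cell mass from the Methods, and interprets the quoted ~90% efficiency as the ratio (U0 − Vact)/U0; that interpretation and the function names are assumptions made here, not statements from the paper.

```python
def activation_time_s(mass_kg, cp_j_per_kg_k, delta_t_k, current_a, u0_v, v_act_v):
    """Solve equation (1) for the activation time at constant current and voltage."""
    heating_power_w = current_a * (u0_v - v_act_v)  # internal heat generation rate
    return mass_kg * cp_j_per_kg_k * delta_t_k / heating_power_w


def heating_efficiency(u0_v, v_act_v):
    """Assumed definition: fraction of the electrical energy turned into internal heat."""
    return (u0_v - v_act_v) / u0_v


# Values from the text for the -30 °C case: 0.160 kg, 1,000 J/(kg K), 30 K, 47.4 A, 4.2 V, 0.4 V
tau = activation_time_s(0.160, 1000.0, 30.0, 47.4, 4.2, 0.4)
eta = heating_efficiency(4.2, 0.4)
```

With these inputs the estimate comes out at roughly 26.6 s, in line with the 26.7 s quoted in the text, and the assumed efficiency ratio is about 0.90.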
Figure 2 | Power performance of the ACB cell. a, 10-s HPPC specific
power versus depth of discharge, compared to the baseline cell for
−20 °C, −30 °C and −40 °C. At 50% SOC, the ACB cell delivers 2.7 times,
6.4 times and 25.1 times the discharge power and 5.1 times, 12.3 times
Vact = 0.4 V activation (inset) and subsequent 1C discharge at −20 °C.
The battery temperature rises from −20 °C to 0 °C in ~20 s and the 1C
discharge thereafter occurs at the ~0 °C battery core temperature rather
than the −20 °C ambient temperature.


Another important feature of the ACB cell is high power, immediately
available after ultrafast activation just as the battery materials
and electrochemical interfaces reach 0 °C. In Fig. 2a, for −20 °C,
−30 °C and −40 °C, the 10-s hybrid pulse power characterization
(HPPC) power in watts per kilogram, for both discharge and regeneration
(charge), is compared as a function of depth of discharge
to that of a conventional Li-ion cell without Ni foil. At 50% state-of-charge
(SOC) or depth of discharge, the power boost over the
conventional Li-ion cell is 2.7-fold, 6.4-fold and 25.1-fold for −20 °C, −30 °C
and −40 °C, respectively, for discharge, and 5.1-fold, 12.3-fold and 55-fold for
regeneration. Figure 2b plots the specific power versus ambient
versus the baseline as a function of ambient temperature for 50% and 80%
SOC. 
Figure 3 | Power on demand at 50% SOC for 10-s HPPC at −30 °C.
a, Normalized power (ACB/baseline) versus relative activation time. To
is the time of full activation. b, Relative activation time and percentage


temperature for both ACB and baseline cells at 50% and 80% SOC, 
respectively. The discharge power (black lines in Fig. 2b) of the ACB 
cell is improved to 1,061 W kg⁻¹ and 1,600 W kg⁻¹ at −30 °C for 50%
and 80% SOC, respectively. These power levels are more than 5-6
times the power of the baseline Li-ion cell at the same temperature.
Regeneration power at low temperatures is equally impressive for the
ACB cell, reaching 1,425 W kg⁻¹ at 50% SOC and 650 W kg⁻¹ at 80%
SOC at −30 °C, indicative of unprecedented high charge/regeneration
power in the extreme cold. These high power capabilities, readily 
available after a short activation, open new possibilities for a wide 
variety of applications where high battery power is critically sought. A 
few examples are the expedient capture of braking energy in extreme 
cold where it is most needed, weather-independent fast charging, and 
high-flying drones at low atmospheric temperatures. 

For ACB cells to enjoy a dramatic power boost at low tempera- 
tures, some activation time and energy (or charge) consumption are 
required. Both can be further managed by exercising a power-on- 
demand strategy, that is, implementing partial activation to attain 
Figure 4 | ACB cell durability. a, C/3 capacity retention. In the present
context, C/3 = (7.5 A)/3 =2.5 A discharge current. b, 1C charge/discharge 
curves of fresh cells and cells aged from 45 °C cycling between 2.8 V and 
4.15 V. Both capacity retention and charge/discharge curves in a and b are 
obtained during cell characterization at 25°C. ACB cells give rise to almost 
no side effects in high-temperature cycling. c, Cell surface temperature 
versus time in a series of ten consecutive cycles of activation and cool- 
capacity consumption due to activation as functions of normalized power.
5.5% energy (the right y-axis) can be exchanged, on demand, for 640% 
a smaller but sufficient power boost. Such experiments are shown in
Fig. 3a and b under the conditions of −30 °C and 50% SOC, where
normalized power, that is, the power of the ACB cell over the baseline,
is plotted against the relative activation time. For zero
activation time, that is, no activation at all, the normalized power is unity.
At 100% activation, the normalized power for the ACB cell at −30 °C
reaches 6.4 for discharge and 12.3 for regeneration.
However, at a partial activation such as 50%,
there are already marked increases in both discharge and regeneration
power, as can be seen from Fig. 3a. Figure 3b re-plots the relative activation
time and capacity consumption percentage due to activation
against the normalized power. It is seen that for a 3-fold power boost,
66% partial activation suffices and the capacity consumption due to
cell activation is only 3.2% for heating from the ambient temperature
of −30 °C. Therefore, the power-on-demand strategy further reduces
the activation time from 30 s to 20 s and also the capacity consumption
due to activation from 5.5% to 3.2%, at the expense of having 3-fold
power instead of 6.4-fold power.
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down in durability of repetitive activations from Tmt = —30°C. The 
change in SOC during the ten cycles is also indicated. d, 1C capacity
versus number of activations for Tamb = 25 °C. The constant-current,
constant-voltage charge protocol is constant current at 1C followed
by constant voltage at 4.2 V and terminated when the charge current
diminishes to C/20. Little degradation exists, even after 500 activations
from −30 °C.
To project a further large reduction in activation time, τact, and
capacity consumption, Qact/Q0, for future energy-dense electric-vehicle
batteries, it follows from equation (1) that if the C-rate, β
(which is the dimensionless electric current, relative to the cell capacity,
such that in the present context, 1C for a 7.5-Ah cell is equivalent to a
discharge current of 7.5 A), during activation is kept constant, the activation
time and percentage capacity consumption are proportional to:

$$\tau_{\mathrm{act}} = \frac{m c_p \Delta T}{Q_0 (U_0 - V_{\mathrm{act}})\,\beta} \propto \frac{V}{U_0 - V_{\mathrm{act}}}\,\frac{1}{\rho_e} \qquad (2)$$

$$\frac{Q_{\mathrm{act}}}{Q_0} = \frac{m c_p \Delta T}{Q_0 (U_0 - V_{\mathrm{act}})} \propto \frac{V}{U_0 - V_{\mathrm{act}}}\,\frac{1}{\rho_e} \qquad (3)$$

where ρe is the cell’s energy density and V is the nominal voltage used
in calculating the energy density.

Note that both activation time and capacity consumption will 
drop by half if the energy density is doubled from the current level of 
170 Wh kg⁻¹ to a future level of 340 Wh kg⁻¹. The halved activation
time and capacity consumption would be at levels of ~10 s and 1.9%
for ACB cell activation from −20 °C, a common low-temperature environment.
Further implementing a partial activation based on power
demand, it would be possible to keep the ACB activation time within
5 s and capacity consumption within 1%, while still delivering sufficient
power for a wide range of applications involving cold climates.
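
The inverse scaling with energy density can be illustrated numerically. The sketch below assumes that everything except the energy density is held fixed (same C-rate, voltages, specific heat and temperature rise), so that both quantities simply scale as 1/ρe; the helper names are hypothetical. It also shows the C-rate convention used in the text.

```python
def scale_with_energy_density(value, rho_now_wh_per_kg, rho_future_wh_per_kg):
    """Scale a quantity that is proportional to 1 / energy density."""
    return value * rho_now_wh_per_kg / rho_future_wh_per_kg


def c_rate_to_current_a(c_rate, capacity_ah):
    """Convert a C-rate into a current for a cell of the given capacity."""
    return c_rate * capacity_ah


# Activation from -20 °C: about 20 s and 3.8% of capacity at 170 Wh/kg (values from the text)
tau_future_s = scale_with_energy_density(20.0, 170.0, 340.0)
consumption_future_pct = scale_with_energy_density(3.8, 170.0, 340.0)

# C-rate convention for the 7.5-Ah cell: 1C = 7.5 A, C/3 = 2.5 A, 8C = 60 A
current_limit_a = c_rate_to_current_a(8, 7.5)
```

Doubling the energy density halves both numbers, giving about 10 s and 1.9%, which matches the projection in the text.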

Existing techniques improve low-temperature power at the expense 
of greatly deteriorated performance and lifespan at high temperatures. 
Figure 4a and b shows no additional side effects with high-temperature 
cycling, and a normal 17% loss of capacity with room-temperature 
cycling over 2,000 cycles (Extended Data Fig. 3). 

Finally, we explore cell degradation caused by repetitively activating
an ACB cell from −30 °C followed by forced air cool-down. A
consecutive ten cycles of activation and cool-down is carried out for 
an ACB cell, starting with 100% SOC (Fig. 4c). Thereafter, the cell is 
charged back to 100% SOC at room temperature and with standard 
constant-current, constant-voltage protocol having a voltage limit 
of 4.2 V. A total of 500 such activations are carried out, and the cell 
capacity at room temperature and 1C rate characterized at the end 
of every ten activation/cool-down cycles is shown in Fig. 4d. The 
cell capacity fade is less than 7.2% at the end of 500 activations. In a 
practical electric-vehicle battery pack, once all batteries are heated to 
a higher temperature, the cool-down timescale usually ranges from 
several hours to 10-15 h. This implies that cell activation is probably
only needed once per day. Assuming 30 days of extreme weather
(−30 °C) per year, 500 activations tested in the durability experiment
shown in Fig. 4d would be equivalent to about 16 years of operation,
meaning battery life is not noticeably decreased by cell activation 
from subfreezing temperatures. 
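
The service-life estimate above is simple arithmetic, sketched here under the stated assumptions of one activation per day and 30 extreme-cold days per year; the function name is hypothetical.

```python
def equivalent_years(total_activations, activations_per_day, cold_days_per_year):
    """Years of service covered by a given number of tested activations."""
    return total_activations / (activations_per_day * cold_days_per_year)


years = equivalent_years(total_activations=500, activations_per_day=1, cold_days_per_year=30)
```

The result is a little under 17 years, consistent with the "about 16 years" quoted in the text.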
The new material we added into the baseline battery to make it an
ACB cell—that is, the Ni foil—weighs about 100 g per kilowatt-hour 
battery and costs US$0.1 per kilowatt-hour based upon a Ni price of 
US$10 per kilogram. Compared to the current best specific energy 
of Li-ion battery systems, which is 150 Wh per kilogram of the bat- 
tery system, and assuming a battery cost of US$250 per kilowatt-hour 
(ref. 11), the added weight and cost due to ACB technology are 1.5% 
and 0.04% of those of the baseline battery. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

We fabricate 7.5-Ah ACB pouch cells using LiNi₀.₅Co₀.₂Mn₀.₃O₂ (Umicore) as
cathodes and graphite (Nippon Carbon) as anodes with 1 M of LiPF₆ dissolved in
ethylene carbonate/ethyl methyl carbonate (3:7 by weight) + 2% vinylene carbonate
as electrolyte (materials from BASF). The capacity ratio of negative to positive
electrode is designed to be 1.2. The 7.5-Ah pouch cell contains a stack of 26 anode
and 25 cathode layers. A Celgard-2325 separator of thickness 25 μm is used. A Ni
foil with a resistance of 56 mΩ at room temperature is coated with a thin backing material
of polyethylene terephthalate (28 μm) for electrical insulation and sandwiched
between two single-sided anode layers, and the three-layer assembly is then stacked
in the centre of the cell.

The cathodes are prepared by coating N-methylpyrrolidone-based slurry onto
15-μm-thick Al foil, whose dry material consists of NCM523 (92 wt%), Super-P
(Timcal) (4 wt%) and polyvinylidene fluoride (Arkema) (4 wt%) as a binder. The
anodes are prepared by coating deionized water-based slurry onto 10-μm-thick Cu
foil, whose dry material consists of graphite (97.5 wt%), styrene butadiene rubber
(Zeon) (1.5 wt%) and carboxymethyl cellulose (Dai-Ichi Kogyo Seiyaku) (1 wt%).

Each ACB pouch cell has a 152 mm × 75 mm footprint area, weighs 160 g, and
has a nominal capacity of 7.5 Ah with a specific energy of 170 Wh kg⁻¹ and an
energy density of 327 Wh per litre. The discharge performance of the ACB cell
at room temperature without activation is shown in Extended Data Fig. 4 as a 
function of the C-rate. 




We denote the voltage between the positive and negative terminals as cell 
voltage, a potential window encompassing all battery materials. Additionally, 
we denote the voltage between the positive and activation terminals as Vact for
the activation process only. For any subzero operation, cell activation is first
carried out by a constant-voltage, constant-current protocol where constant voltage
means that Vact is set at 0.4 V until the current reaches and is limited at 60 A
(that is, 8C). Cell activation is terminated when the cell temperature reaches
−5 °C as measured by a thermocouple placed at the centre of the cell’s outer
surface. A 10-s rest is given between the end of activation and cell loading for 
equilibrium, during which the cell surface temperature usually continues to rise 
to 0°C. Hence, cell activation described in the present work is designed to bring 
the battery core temperature to or above the freezing point from any subzero 
ambient environment. Prior to any subfreezing tests, an ACB cell is soaked in 
the environmental chamber for 8-12 h to reach thermal equilibrium with the
ambient temperature. 

Two types of cell discharge are performed in the present work. One is 1C discharge
with a cutoff voltage of 2.8 V and the other is 10-s HPPC in which, at a given
SOC level, a 10-s charge pulse is applied at Vmax = 4.2 V, followed by a 40-s rest and
a discharge pulse at Vmin = 2.8 V. The discharge and charge (or regeneration) power,
in watts per kilogram of the battery cell, is calculated as the product of constant
voltage and average current in the 10-s discharge and charge pulses, then divided
by the cell weight.
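
The power calculation described above can be written as a short helper. This is a minimal sketch assuming uniformly sampled current during a constant-voltage pulse; the function name and the current samples are made up for illustration, and only the 2.8-V pulse voltage and the 160-g cell mass come from the text.

```python
def hppc_specific_power_w_per_kg(pulse_voltage_v, currents_a, cell_mass_kg):
    """Constant pulse voltage times mean pulse current, divided by cell mass."""
    if not currents_a:
        raise ValueError("at least one current sample is required")
    # Use magnitudes so that charge and discharge pulses are treated alike
    mean_current_a = sum(abs(i) for i in currents_a) / len(currents_a)
    return pulse_voltage_v * mean_current_a / cell_mass_kg


# Made-up current samples (A) over a 10-s discharge pulse held at Vmin = 2.8 V
samples_a = [50.0, 48.0, 47.0, 46.0, 45.0]
power = hppc_specific_power_w_per_kg(2.8, samples_a, cell_mass_kg=0.160)
```

The same function applies to the charge pulse by passing Vmax = 4.2 V and the charge-current samples.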
Extended Data Figure 1 | Cell voltage and temperature evolution during activation and subsequent 1C discharge. a, −30 °C. b, −40 °C. The insets
show the Vact = 0.4 V activation more clearly.
Extended Data Figure 2 | Cell current variations during activation. a, −20 °C. b, −30 °C. c, −40 °C. d, Activation time τact and average activation
current Iact versus the ambient temperature Tamb.
Extended Data Figure 3 | 1C charge/2C discharge cycling of ACB cell at room temperature between 2.8 V and 4.2 V. a, C/3 capacity retention.
b, 1C charge/discharge curves of the fresh and aged cells.
Extended Data Figure 4 | ACB cell discharge with various C-rates of
discharge and at room temperature.


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


doi:10.1038/nature16453 


No iron fertilization in the equatorial Pacific Ocean during the last ice age

K. M. Costa¹,², J. F. McManus¹,², R. F. Anderson¹, H. Ren³, D. M. Sigman⁴, G. Winckler¹,², M. Q. Fleisher¹, F. Marcantonio⁵ & A. C. Ravelo⁶


The equatorial Pacific Ocean is one of the major high-nutrient, 
low-chlorophyll regions in the global ocean. In such regions, the 
consumption of the available macro-nutrients such as nitrate and 
phosphate is thought to be limited in part by the low abundance 
of the critical micro-nutrient iron¹. Greater atmospheric dust
deposition² could have fertilized the equatorial Pacific with iron
during the last ice age (the Last Glacial Period, LGP), but the
effect of increased ice-age dust fluxes on primary productivity in the
equatorial Pacific remains uncertain**. Here we present meridional
transects of dust (derived from the ²³²Th proxy), phytoplankton
productivity (using opal, ²³¹Pa/²³⁰Th and excess Ba), and the degree
of nitrate consumption (using foraminifera-bound δ¹⁵N) from six
cores in the central equatorial Pacific for the Holocene (0-10,000 
years ago) and the LGP (17,000-27,000 years ago). We find that, 
although dust deposition in the central equatorial Pacific was two 
to three times greater in the LGP than in the Holocene, productivity 
was the same or lower, and the degree of nitrate consumption was 
the same. These biogeochemical findings suggest that the relatively 
greater ice-age dust fluxes were not large enough to provide 
substantial iron fertilization to the central equatorial Pacific. This 
may have been because the absolute rate of dust deposition in the 
LGP (although greater than the Holocene rate) was very low. The 
lower productivity coupled with unchanged nitrate consumption 
suggests that the subsurface major nutrient concentrations were 
lower in the central equatorial Pacific during the LGP. As these 
nutrients are today dominantly sourced from the Subantarctic 
Zone of the Southern Ocean, we propose that the central equatorial 
Pacific data are consistent with more nutrient consumption in the 
Subantarctic Zone, possibly owing to iron fertilization as a result 
of higher absolute dust fluxes in this region””*. Thus, ice-age iron 
fertilization in the Subantarctic Zone would have ultimately worked 
to lower, not raise, equatorial Pacific productivity. 

The major nutrients for phytoplankton growth (nitrogen, phospho- 
rus and silicon) are supplied to the surface waters of the equatorial 
Pacific by wind-driven upwelling along the Equator. Their consump- 
tion by phytoplankton is thought to be limited in part by the low 
concentrations of the critical micro-nutrient iron¹. Successful iron
fertilization experiments in the modern ocean? have demonstrated 
the sensitivity of these regions to changes in the micro-nutrient sup- 
ply. Dust dissolution is one source of iron to the ocean, and globally 
increased dust fluxes” may have caused natural iron fertilization during 
the peak of the LGP. There is evidence for iron fertilization® in the 
Subantarctic Zone of the Southern Ocean, and the associated carbon 
storage in the deep ocean may have been responsible for almost half of 
the carbon dioxide drawdown during the LGP¹⁰. However, the effects
of increased ice-age dust fluxes on the equatorial Pacific are debated*, 
with arguments both for and against iron fertilization, particularly in 
the eastern equatorial Pacific. 


Here we present new proxy data on dust flux (²³²Th flux, see
Methods), biological productivity (‘export production’, the export
of organic matter out of surface water, as reconstructed from the
opal flux, excess barium flux, and ²³¹Pa/²³⁰Th, for which ²³⁰Th and
²³¹Pa represent excess initial ²³⁰Th and ²³¹Pa, respectively) and
the degree of nitrate consumption (foraminifera-bound δ¹⁵N)
from a north-south transect of six cores from the central equatorial
Pacific (0.22° S to 6.83° N, 156°-161° W; Extended Data Fig. 1)
at two time slices: the Holocene (0-10,000 years ago) and the LGP
(17,000-27,000 years ago). The relatively shallow water depths
(average ~3,000 m) result in low rates of carbonate dissolution and
permit the development of robust foraminifera-based radiocarbon age
models (Extended Data Fig. 2, Extended Data Table 1). Furthermore,
these core sites are far from the eastern continental margins, and so
²³²Th at these sites predominantly reflects the flux of airborne dust
particles”. Central equatorial Pacific surface waters are dominantly
sourced with nitrate from the Equatorial Undercurrent, which originates
in the west¹¹. Thus, relative to the tropical Pacific as a whole, the
δ¹⁵N of the nitrate supply in the central equatorial Pacific is unlikely
to be particularly sensitive to changes in eastern Pacific denitrification
(see Methods).

Because the central equatorial Pacific is far from dust sources, reconstructed
dust fluxes are among the lowest ever measured!*. Core-top
dust fluxes along the 160° W transect average 11.0 mg cm⁻² kyr⁻¹,
with a maximum of 12.8 mg cm⁻² kyr⁻¹ at 2.46° N (Fig. 1a). There is a
weak decline in dust flux with increasing latitude (r² = 0.42, P = 0.88),
with the lowest dust flux (8.8 mg cm⁻² kyr⁻¹) at the most northerly
core. This negative correlation is in contrast to more easterly (110° W,
140° W) meridional transects, where the highest dust fluxes occur
at the more northerly cores’. Relative to the Holocene, ice-age dust
fluxes are two to three times greater along the 160° W transect, averaging
28.6 mg cm⁻² kyr⁻¹ with a maximum of 32.2 mg cm⁻² kyr⁻¹ at
2.46° N. The dust fluxes are remarkably constant as a function of
latitude. Overall, the greater dust fluxes during the LGP are consistent
with other reconstructions across the equatorial Pacific, which find
glacial dust fluxes 0.7 to 3.4 times those of the Holocene (Fig. 2).

However, the expectations of ice-age iron fertilization do not corre- 
spond with the observed changes in surface productivity (as determined 
from opal flux, excess barium flux, 231pa/?3°Th; see Methods). Core-top 
opal fluxes along the transect at 160° W average 47 mgcm~kyr~! and 
are negatively correlated with latitude (7? =0.90, P=0.31) (Fig. 1b). The 
maximum opal flux (70 mgcm~kyr7!) occurs at the Equator, which 
is consistent with higher surface productivity within the equatorial 
upwelling zone. Compared to the core-top fluxes, glacial opal fluxes 
are mostly lower, averaging 37 mg cm⁻² kyr⁻¹, a finding that is inconsistent 
with the expectations of local iron fertilization. Glacial fluxes 
also diminish northward from the Equator, consistent with a stable 
position for the upwelling. 
USA. ³Department of Geosciences, National Taiwan University, Taipei 106, Taiwan. ⁴Department of Geosciences, Princeton University, Princeton, New Jersey 08544, USA. ⁵Department of Geology 
and Geophysics, Texas A&M University, College Station, Texas 77843, USA. ⁶Ocean Sciences Department, University of California, Santa Cruz, California 95064, USA. 
Figure 1 | ²³⁰Th-normalized dust flux (see Methods), opal flux, 
²³¹Pa/²³⁰Th, excess Ba flux, and FB-δ¹⁵N. a–e, Samples from the 
Holocene (0–10,000 years ago) are orange; samples from the LGP 
(17,000–27,000 years ago) are blue. The magnitude of variability in 
dust fluxes within the Holocene (n = 30) and within the LGP (n = 31) 


The glacial–interglacial productivity signal is corroborated by initial 
²³¹Pa/²³⁰Th, an opal flux proxy that has the advantage of being insensitive 
to remineralization!*, and excess barium (Ba) fluxes, an independent 
proxy for total export production’. Relative to the Holocene, 
²³¹Pa/²³⁰Th ratios and excess Ba fluxes are generally lower during the 
LGP (Fig. 1c, d), except for the northernmost site outside the upwelling 
zone. The consistent latitudinal trends, with the highest productiv- 
ity at the Equator and decreasing productivity northward, suggest 
no major difference in the upwelling regime between the Holocene 
and the LGP. Furthermore, measured ²³¹Pa/²³⁰Th ratios are constant 
throughout each time slice (Extended Data Fig. 3), indicating that 
our results are not biased by bioturbation of transient extreme values 


is negligible compared to the more than doubling of dust fluxes in the 
LGP compared to the Holocene. Error bars are 2σ and indicate analytical 
precision. Error bars on FB-δ¹⁵N reflect the variation of samples within the 
time slice (indicating 2σ). Data for this figure are given in the Source Data 
for Fig. 1. 


(for example, a deglacial productivity maximum'®). The positive correlation 
between the excess Ba fluxes, ²³¹Pa/²³⁰Th, and opal fluxes 
(Extended Data Fig. 6) demonstrates that, in this region, siliceous 
productivity is a proxy for total export production. Accordingly, both 
core-top and glacial ²³¹Pa/²³⁰Th from across the equatorial Pacific?!” 
(Fig. 2) indicate that lower surface productivity was a regional phe- 
nomenon during the LGP. 

Across much of the global ocean, including most of the tropics and 
subtropics, the supply of major nutrients limits productivity. Iron 
limitation generally applies in regions with upwelling or deep mix- 
ing, where the major nutrient supply is increased but (given the low 
iron-to-major-nutrient concentration ratios of most deep waters) iron 


[Figure 2 axis labels: a, latitude (°N) and [NO₃⁻] (µmol per litre); b, dust flux (mg cm⁻² kyr⁻¹); all panels plotted against longitude, 140° E to 80° W.] 

Figure 2 | Dust flux and ²³¹Pa/²³⁰Th across the equatorial Pacific. a, Map 
of annual average surface nitrate concentrations*’. Black circles indicate 
the core locations, and the black box identifies the six new cores presented 
in this study. b, Dust flux inferred from ²³⁰Th-normalized ²³²Th fluxes 
(circles) and lithogenic accumulation rates (diamonds) for the Holocene 
(0-10,000 years ago, orange symbols) and the LGP (17,000-27,000 years 
ago, blue symbols). Meridional transects at 110° W and 140° W are offset 
longitudinally to clarify the Holocene and LGP time slices. c, ²³¹Pa/²³⁰Th 
for the Holocene (orange squares) and the LGP (blue squares). The dashed 
line represents the production ratio (0.093; see Methods). Data and 
references for this figure are given in the Source Data for Fig. 2. 
supply is not equivalently increased, leading to incomplete consumption 
of the major nutrients. Therefore, iron fertilization by dust could 
drive an increase in the degree of major nutrient consumption in the 
upwelled surface waters of the equatorial Pacific. The nitrogen iso- 
topic ratio of organic matter bound within the shell walls of planktonic 
foraminifera (foraminifera-bound δ¹⁵N, FB-δ¹⁵N) covaries with the 
degree of nitrate consumption’, and here we present FB-δ¹⁵N data from 
two euphotic-zone species (Globigerinoides ruber and G. sacculifer) and 
one deeper-dwelling species (G. tumida). In both the Holocene and 
the LGP, FB-δ¹⁵N differences among the species parallel depth habitat 
(Fig. 1e), consistent with previous findings of higher FB-δ¹⁵N in 
deeper-dwelling species!®. For each species, FB-δ¹⁵N increases northward 
from the Equator, reflecting the progressive consumption of 
nitrate as surface waters flow away from the centre of upwelling, with 
the FB-δ¹⁵N rise accelerating north of 3° N, as expected when nitrate 
consumption approaches completion”. For each species, core-top and 
glacial FB-δ¹⁵N are remarkably similar across the meridional transect 
of cores. The lack of any difference between LGP and Holocene 
values argues strongly against glacial iron fertilization, which would 
be expected both to raise FB-δ¹⁵N over the equatorial upwelling and 
to steepen the northward FB-δ¹⁵N increase. 

The unchanged relationship between ²³¹Pa/²³⁰Th and opal flux 
(Extended Data Fig. 6) also supports sustained iron limitation. 
Silicification of diatom frustules is thought to be sensitive to iron 
stress®, with more robust and, therefore, better preserved frustules 
produced under conditions of greater iron limitation. The constancy 
of the relationship between opal flux and ²³¹Pa/²³⁰Th (Extended 
Data Fig. 6), which would be sensitive to changes in opal preser- 
vation, indicates that any change in silicification due to iron stress 
between the LGP and Holocene was too small to alter this relationship 
measurably. 

The lack of ice-age iron fertilization in the equatorial Pacific contrasts 
with its evident occurrence in the Southern Ocean”*. This dissimilarity 
may be due to three different factors related to dust fluxes and their 
fertilizing effects. First, the absolute fluxes of dust to the equatorial 
Pacific are so low in the Holocene that a doubling or even tripling 
of these fluxes during the LGP would still have amounted to only a 
minor atmospheric input of iron. Ice-age dust fluxes in the Subantarctic 
Zone of the Southern Ocean (500–700 mg cm⁻² kyr⁻¹)² were 20–30 
times greater than ice-age dust fluxes in the central equatorial Pacific 
(20–30 mg cm⁻² kyr⁻¹), suggesting that glacial dust in the equatorial 
Pacific remained roughly an order of magnitude too low to relieve iron 
limitation. Second, iron in equatorial Pacific dust may be less biologi- 
cally accessible than the iron in Southern Ocean dust, given that recent 
work on mineralogical solubility of dust indicates that dust derived 
from glacial till (as was deposited in the ice-age Southern Ocean) is 
two to three times more soluble than dust derived from arid soils (as 
was deposited in the equatorial Pacific)”°. Third, the concentration of 
iron upwelled from the Equatorial Undercurrent, the principal regional 
iron source”', is augmented by partial dissolution of fluvial particles 
eroded from Papua New Guinea”. Regionally dry conditions during 
the LGP*$ may have reduced erosion and river transport rates, resulting 
in lower iron concentrations in the Equatorial Undercurrent that may 
have countered the small increase in atmospheric deposition of iron, 
although a full investigation into this negative feedback is beyond the 
scope of this study. 

The combination of lower productivity and the constant degree 
of nitrate consumption suggests that the nitrate supply in the equa- 
torial Pacific was lower in the LGP than in the Holocene. Reduced 
nitrate supply could be the result of (1) lower nitrate concentrations in 
upwelling waters, (2) a deeper thermocline, and (3) lower upwelling 
rates. Coupled sea surface temperature and productivity records 
from the eastern equatorial Pacific argue against large differences in 
upwelling rates between the LGP and the late Holocene**. Moreover, in 
the modern central equatorial Pacific, the degree of nitrate consump- 
tion is observed to rise during the springtime minimum in upwelling 
Figure 3 | Changes in nutrient dynamics between the Holocene and 
the LGP. General circulation shown as black arrows. AAIW, Antarctic 
Intermediate Water. During the LGP, a 4–6-fold increase in dust supply 
(red circles) stimulated biological productivity (green circles and 
arrows) and enhanced the degree of nutrient consumption in the SAMW 
formation regions to levels much greater than during the Holocene”*. 
Greater glacial productivity left a smaller inventory of nutrients to be 
subducted into the thermocline during SAMW formation, thereby 
lowering the supply of nutrients to equatorial upwelling regions. 


rate¹¹, and palaeoceanographic data suggest that it also rose during 
orbitally driven phases of lower upwelling”». 

Conversely, there is evidence to support both lower major nutrient 
concentrations in the water upwelled in the equatorial Pacific”® and 
a deeper thermocline”’ during the LGP. The aforementioned dust- 
driven enhancement of nitrate consumption in Subantarctic Zone 
surface waters would have reduced the nitrate concentrations exported 
in Subantarctic Mode Water (SAMW) to the low latitudes”® (Fig. 3). 
These lower nitrate concentrations in upwelled waters can explain a 
decline in productivity in spite of a constant degree of nitrate consump- 
tion, especially if combined with a deeper thermocline, which would 
inhibit the entrainment of nutrient-rich waters into the upwelling zone. 
Furthermore, the constant degree of nitrate consumption suggests that 
the iron content of the upwelled water was lower during the LGP by the 
same proportion that major nutrients were lower relative to Holocene 
concentrations. This parallel response may indicate that the iron con- 
tent of shallow subsurface waters maintains a relatively constant ratio 
with the major nutrients, through a conserved iron-to-major-nutrient 
ratio in sinking organic matter”. 

This study demonstrates that there was no enhancement of nitrate 
uptake by dust (iron) fertilization in the equatorial Pacific during 
the LGP. However, glacial-interglacial changes in dust flux do affect 
the equatorial Pacific through changes in global nutrient dynamics. 
Iron fertilization in the Southern Ocean during the LGP generated a 
higher degree of nutrient consumption, higher surface productivity, 
and lower nutrient export in the SAMW-formation regions”§, which 
reduced the nutrient supply to the equatorial Pacific compared to the 
Holocene. The inverse correlation between the Subantarctic Zone of the 
Southern Ocean and equatorial Pacific surface productivity on glacial- 
interglacial timescales implies that increases in Southern Ocean 
productivity occur at the expense of equatorial Pacific productivity, 
a compensating mechanism by which globally integrated ocean 
productivity would be stabilized over glacial cycles. 
METHODS 


Site location and chronology. The cores used in this study were collected aboard 
the RV Marcus G. Langseth on a dedicated cruise to the Line Islands in May 
2012. The sites range from just south of the Equator (0.22 °S) to approximately 
7° N along the 159° W (±3°) meridian (Extended Data Fig. 1, Extended Data 
Table 1). Shallow core sites (3,545 m at the deepest) suggest minimal carbonate 
dissolution, with carbonate concentrations ranging from 83 to 98 wt%. Core chronologies 
were established with four radiocarbon dates on G. ruber: 0 cm and 8 cm 
depth in multicores, and two depths in the Big Bertha piston cores that bracketed 
the δ¹⁸O maximum inferred to represent Marine Isotope Stage 2 (ref. 31). Analyses 
were performed at the National Ocean Sciences Accelerator Mass Spectrometry 
Facility at Woods Hole Oceanographic Institution and at the Lawrence Livermore 
National Laboratory. Radiocarbon ages were calibrated to calendar years using 
Calib 7.0 Marine13**?. Age models were established via linear interpolation 
between radiocarbon dates (Extended Data Fig. 2, Extended Data Table 2). Core- 
top ages greater than zero are caused by bioturbation of the surficial sediment, 
resulting in inflated sedimentation rates in the multicores. Because our samples 
were selected based on δ¹⁸O before radiocarbon dating, not all data points fall 
within the EPILOG definition of the Last Glacial Maximum (the peak of the LGP) 
of 18,000-24,000 years ago™. However, amending the data to only those within 
that time frame (n = 21 of 31) has no substantial impact on the interpretations 
presented here. 
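
As an illustration of the age-model step, the following minimal sketch interpolates linearly between dated depths. It is not the authors' code: the function name `interpolate_age` and the tie-point values are hypothetical, chosen only to show the arithmetic.

```python
def interpolate_age(depth_cm, tie_points):
    """Linearly interpolate a calendar age (kyr) at depth_cm.

    tie_points is a list of (depth_cm, age_kyr) pairs sorted by depth,
    standing in for calibrated radiocarbon dates (hypothetical values).
    """
    for (d0, a0), (d1, a1) in zip(tie_points, tie_points[1:]):
        if d0 <= depth_cm <= d1:
            # Straight line between the two bracketing dates.
            return a0 + (a1 - a0) * (depth_cm - d0) / (d1 - d0)
    raise ValueError("depth lies outside the dated interval")


# Illustrative multicore with dates at 0 cm and 8 cm (not measured values).
example_dates = [(0.0, 2.0), (8.0, 6.0)]
print(interpolate_age(4.0, example_dates))
```

Depths outside the dated interval raise an error rather than being extrapolated, which mirrors the statement that ages are interpolated between radiocarbon dates.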

Analytical methods. Approximately five samples from the multicores 
(Holocene) and five samples from the Big Bertha piston cores (LGP) from each 
of the six sites were analysed for thorium (²³⁰Th, ²³²Th), uranium (²³⁸U, ²³⁵U, 
²³⁴U), and protactinium (²³¹Pa) by isotope dilution inductively coupled plasma 
mass spectrometry (ICP-MS). Samples (100 mg) were randomized, spiked with 
²²⁹Th, ²³⁶U and ²³³Pa, and processed with complete acid digestion and column 
chromatography*’. Isotopes were measured on an Element 2 ICP-MS at the 
Lamont-Doherty Earth Observatory of Columbia University. An internal sedi- 
ment standard (Line Islands MegaStandard, LIMS) was used for quality control, 
and replicates of LIMS (n= 15) indicate that measurements are reproducible 
within <5% on all isotopes. Excess initial ²³⁰Th and excess initial ²³¹Pa values 
were calculated by correcting for supported decay (the fraction of ²³⁰Th that is in 
equilibrium with its parent U isotope) from lithogenic and authigenic uranium”. 
These corrections are negligible, with >98.9% of the total ²³⁰Th scavenged from 
the water column and present as unsupported ²³⁰Th (see section below on ²³⁰Th 
normalization). Dust fluxes are calculated by normalizing the ²³²Th concentration 
to ²³⁰Th and using an average ²³²Th concentration in equatorial Pacific dust 
of 10.7 parts per million’. Justification of ²³²Th as a tracer for aeolian input is 
provided elsewhere’’. 

Biogenic opal was measured by alkaline extraction** at the University of British 
Columbia (Extended Data Fig. 4). Opal fluxes were calculated by ²³⁰Th normalization 
(see below). Opal fluxes reflect the burial of opal in the sediment, which is a 
function of both productivity and preservation, and so we compared the opal fluxes 
to ²³¹Pa/²³⁰Th, which is insensitive to preservation effects'*'”. Poor opal preservation 
may be a result of less silicified and thus more easily dissolvable diatom 
frustules, in response to the higher iron concentrations“! during the LGP, which 
would increase the ²³¹Pa/²³⁰Th to opal ratios. Instead we observe no change in the 
?31pa/?>°Th to opal ratio in the time slices presented here: 7.89 + 1.24 x 10? in the 
Holocene, 8.09 + 1.31 x 107? in the LGP, and 7.60 £0.53 x 107? overall (Extended 
Data Fig. 6). The temporal invariance of the ²³¹Pa/²³⁰Th to opal ratio indicates 
both that there was no substantial change in diatom silicification due to iron stress 
and that the degree of opal preservation did not change substantially between the 
LGP and the Holocene”. Therefore we interpret changes in opal flux as changes 
in productivity. 

Barium concentrations were analysed by total digestion at Texas A&M 
University of the same samples analysed for biogenic opal (Extended Data Figs 
4 and 5). The standard deviation on barium concentration data is <4.3%, with 
an average of 2.8% for all samples. Excess barium concentrations were calculated 
by subtracting the lithogenic barium component, determined using a lithogenic 
Ba/Th mass ratio of 51.4 in the upper continental crust*?. Excess barium fluxes 
were calculated by ²³⁰Th normalization (see below). Because dust fluxes are so 
low in the central equatorial Pacific, the lithogenic Ba fraction is small, <2% of 
the total Ba in the Holocene and <6% of the total Ba in the LGP. The lithogenic 
correction is insensitive to variability in the lithogenic Ba/Th ratio, so that varia- 
tions over an order of magnitude (Ba/Th = 10-100) induce only small deviations 
(<10%) in the excess Ba fluxes (Extended Data Fig. 5). Excess barium in marine 
sediment is almost exclusively found in the form of barite, a robust sedimen- 
tary component insensitive to both dissolution and diagenesis!°. The excess Ba 
flux may reflect the minimum amplitude for the change in productivity, as the 
proportional difference in excess Ba flux between the LGP and the Holocene 
(up to 17%) is lower than that of ²³¹Pa/²³⁰Th (up to 19%) and the opal flux (up 
to 33%) (Extended Data Fig. 7). 
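
The lithogenic correction described above amounts to a single subtraction, sketched here under stated assumptions. The function name `excess_ba` and the sample concentrations are hypothetical, and the sketch assumes that the Th in the Ba/Th ratio can be taken as the measured ²³²Th; only the ratio of 51.4 and the 10 to 100 sensitivity range come from the text.

```python
LITHOGENIC_BA_TH = 51.4  # lithogenic Ba/Th mass ratio quoted in the text


def excess_ba(total_ba_ppm, th232_ppm, ba_th_ratio=LITHOGENIC_BA_TH):
    """Excess (non-lithogenic) Ba concentration in ppm.

    Assumes lithogenic Ba scales with 232Th through a fixed mass ratio.
    """
    return total_ba_ppm - ba_th_ratio * th232_ppm


# Hypothetical sample: 1,000 ppm total Ba and 0.3 ppm 232Th.
reference = excess_ba(1000.0, 0.3)
for ratio in (10.0, 51.4, 100.0):
    # Express each result relative to the value obtained with 51.4.
    print(ratio, excess_ba(1000.0, 0.3, ratio) / reference)
```

For a dust-poor sample of this kind the lithogenic term is a small share of total Ba, so varying the ratio across an order of magnitude moves the result by only a few per cent, in line with the insensitivity described in the text.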

Foraminifera-bound nitrogen isotopes were analysed following the procedure 
outlined by ref. 18. Approximately 10 mg of Globigerinoides ruber, Globigerinoides 
sacculifer, and Globorotalia tumida were picked from five Holocene samples and 
two glacial samples from each of the six sites. Samples were chemically cleaned 
to remove nitrogen contamination, oxidized with persulfate to convert organic 
nitrogen to nitrate, measured by chemiluminescence for nitrate concentration, 
and converted to nitrous oxide by a bacterial denitrifier that lacks a functional 
nitrous oxide reductase. The δ¹⁵N of nitrous oxide was measured by gas chromatography 
isotope mass spectrometry on a purpose-built nitrous oxide extraction 
and purification system on-line with a Thermo MAT 253 stable isotope ratio mass 
spectrometer at Princeton University. δ¹⁵N is reported relative to atmospheric N₂. 
Analyses of Holocene samples (n=5 per core) by H.R. and LGP samples (n= 2 
per core) by K.M.C. were performed in the same laboratory following the same 
procedure. The standard deviation for analysis of a picked and cleaned sample 
of a given foraminifera species is <0.3‰, while the reproducibility for a given 
coarse fraction, including the species picking and cleaning, averages 0.3‰, with 
a maximum of 0.8‰. 
²³⁰Th normalization. The concentration of minor sedimentary components, such 
as opal and excess Ba, can vary in response both to changes in the input of the 
components themselves as well as to changes in the dilution by major sedimentary 
components, such as CaCO₃. In the Pacific, carbonate is better preserved in glacial 
periods than in interglacial periods“, and in the Line Islands we observed higher 
CaCO₃ concentrations (91%–93%) in the LGP than in the Holocene (86%–91%) 
(Extended Data Fig. 4). In the LGP we generally see lower opal and excess Ba concentrations, 
consistent with increased dilution by CaCO₃. To remove the dilution 
effects and isolate changes in the opal and excess Ba inputs, we normalize to ²³⁰Th, 
a constant flux proxy. 

²³⁰Th and ²³¹Pa are produced in the water column by the decay of ²³⁴U and 
²³⁵U, respectively. Because U is highly soluble in sea water, it has a long residence 
time (400,000 years, or 400 kyr) and a fairly constant concentration (3.2 parts per 
billion) that scales conservatively with salinity’. Thus the production of ²³⁰Th 
and ²³¹Pa is relatively uniform across the global ocean and depends primarily on 
the water depth and to a minor extent on salinity. ²³⁰Th is produced at a rate of 
0.0262 disintegrations per minute (d.p.m.) m⁻³ yr⁻¹ (β₂₃₀), while ²³¹Pa is produced 
at a rate of 0.00245 d.p.m. m⁻³ yr⁻¹ (β₂₃₁, measured in activity units)*°. The 
production ratio of ²³¹Pa to ²³⁰Th (β₂₃₁/β₂₃₀ = 0.093) is assumed to be constant 
across the global ocean. 

Unlike U, both Th and Pa are practically insoluble in sea water and thus have 
relatively short residence times in the ocean (20-40 years and 100-200 years, 
respectively)*”. ²³⁰Th and ²³¹Pa are removed from sea water by scavenging onto 
settling particles, and they are buried in the underlying sediments unsupported 
by their parent nuclides. This excess ²³⁰Th and ²³¹Pa in the sediment decays over 
time with respective half-lives of 75.69 kyr and 32 kyr (ref. 48). 

The residence time of ²³⁰Th in the ocean is so short (20–40 years) compared 
to its half-life that virtually all of the ²³⁰Th produced by U decay in sea water 
is removed to sediments by scavenging. The residence time of ²³⁰Th is also 
much less than the timescale for lateral transport by mixing from regions of 
low scavenging intensity (low particle flux) to regions of high scavenging intensity. 
Consequently, throughout the ocean the flux of ²³⁰Th carried to the sea 
bed by sinking particles is within ~30% of its production rate in the overlying 
water column**. Given a rate of supply that depends mainly on water depth, the 
concentration of excess ²³⁰Th in the underlying sediment is a function of the 
sediment rain rate, or bulk mass flux (BMF). Higher BMF will dilute the excess 
²³⁰Th concentration in the sediment. Thus the BMF can be calculated by dividing 
the integrated ²³⁰Th production in the overlying water column (β₂₃₀z, where z is the water depth) by the excess 
initial ²³⁰Th, the concentration of excess ²³⁰Th in the sediment corrected for decay 
since deposition*®. ²³⁰Th-normalized fluxes of specific sedimentary components 
i, such as opal and excess Ba, can be calculated as fᵢ × BMF, where fᵢ is the fraction 
of component i in the sediment. Normalization to ²³⁰Th removes the effects of 
dilution, and it can correct for lateral rather than vertical sediment inputs (such 
as focusing and winnowing). 
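
The normalization described in this section can be written out as a short calculation. The sketch below is illustrative rather than the authors' implementation: the function names, the water depth and the sample concentrations are hypothetical, while the production rate (0.0262 d.p.m. m⁻³ yr⁻¹) and the ²³²Th content of dust (10.7 ppm) are the values quoted in the text.

```python
BETA_230 = 0.0262        # 230Th production rate, d.p.m. m^-3 yr^-1 (from the text)
DUST_232TH_PPM = 10.7    # average 232Th in equatorial Pacific dust, ppm (from the text)


def bulk_mass_flux(depth_m, xs230_initial_dpm_per_g):
    """230Th-normalized bulk mass flux in mg cm^-2 kyr^-1.

    Integrated production (beta * z, d.p.m. m^-2 yr^-1) divided by the
    excess initial 230Th concentration (d.p.m. g^-1) gives g m^-2 yr^-1.
    """
    g_per_m2_per_yr = BETA_230 * depth_m / xs230_initial_dpm_per_g
    return g_per_m2_per_yr * 100.0  # 1 g m^-2 yr^-1 = 100 mg cm^-2 kyr^-1


def component_flux(fraction, bmf):
    """Flux of a sedimentary component: f_i * BMF."""
    return fraction * bmf


def dust_flux(th232_ppm, bmf):
    """Dust flux, taking the dust fraction as 232Th(sample) / 232Th(dust)."""
    return component_flux(th232_ppm / DUST_232TH_PPM, bmf)


# Hypothetical sample: 3,000 m water depth, 10 d.p.m. g^-1 excess initial 230Th.
bmf = bulk_mass_flux(3000.0, 10.0)
print(bmf, component_flux(0.05, bmf), dust_flux(0.15, bmf))
```

Because every component flux is the product of its fraction and the same BMF, dilution by carbonate changes the fraction and the excess ²³⁰Th concentration together and cancels out of the result.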

The validity of the ²³⁰Th-normalization technique has generated much debate 
in the palaeoceanographic community“? *1. ²³⁰Th systematics have been called into 
question with regard to (1) dependence on age models and propagated uncertainties; 
(2) changes in production of ²³⁰Th by uranium decay in the water column; 
(3) size fractionation of ²³⁰Th as a result of sediment focusing and winnowing; and 
(4) changes in the local rate of scavenging of ²³⁰Th from the water column by 
sinking particles. Although these present important considerations when interpreting 
²³⁰Th-normalization, several lines of evidence nevertheless justify its usage 
in this study. 
Anomalously low mass fluxes could be inferred to be a result of erroneously 
overcorrecting the radioactive decay of excess ²³⁰Th in LGP sediments. This 
uncertainty becomes increasingly important as sediment ages, particularly after 
several half-lives of ²³⁰Th (for example, >400 kyr). The relatively short timescales 
investigated here (15–20 kyr) are unlikely to be sensitive to small changes in the 
age models, as the age correction from excess ²³⁰Th to excess initial ²³⁰Th is only 
of the order of a 1.6%–30% increase in ²³⁰Th concentration. For example, to 
equate the Holocene and LGP equatorial opal fluxes, holding all isotopic concentrations 
constant, the age of the LGP sample would need to be 2,000 years 
old rather than 21,400 years old. Such an extreme error in the age assignment 
can be ruled out because the cores presented here have internally consistent 
age models constrained by radiocarbon and by δ¹⁸O. Furthermore, evidence for 
systematic age offsets would be discernible in the ²³¹Pa/²³⁰Th ratio. Because ²³¹Pa 
has a shorter half-life than ²³⁰Th, the correction for radioactive decay is much 
more sensitive to age assignments, with the result that the sense of the error in 
the ²³¹Pa/²³⁰Th ratio is opposite to the sense of the error in ²³⁰Th-normalized 
flux. Increasing the mass flux (for example, by adjusting the sample age to be 
younger) will simultaneously decrease the ²³¹Pa/²³⁰Th ratio. The presence of 
both lower ²³¹Pa/²³⁰Th and lower ²³⁰Th-normalized fluxes during the LGP 
indicates that the lower palaeoproductivity is not an artefact of inaccurate age 
corrections. 
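
The size of the age correction discussed above can be checked with a few lines of arithmetic. This is a sketch using the standard exponential decay law and the ²³⁰Th half-life quoted in the text (75.69 kyr); the function names are hypothetical and the two ages are the ones used in the example above.

```python
import math

HALF_LIFE_230TH_KYR = 75.69  # half-life quoted in the text


def decay_correction(age_kyr, half_life_kyr=HALF_LIFE_230TH_KYR):
    """Factor that converts measured excess 230Th to excess initial 230Th."""
    return math.exp(math.log(2.0) * age_kyr / half_life_kyr)


def flux_ratio_if_reassigned(original_age_kyr, new_age_kyr):
    """Ratio of the 230Th-normalized flux at new_age to that at original_age.

    The inferred flux is inversely proportional to excess initial 230Th,
    so the ratio is the inverse ratio of the two decay corrections.
    """
    return decay_correction(original_age_kyr) / decay_correction(new_age_kyr)


print(decay_correction(2.0))                # correction for a 2-kyr-old sample
print(decay_correction(21.4))               # correction for a 21.4-kyr-old sample
print(flux_ratio_if_reassigned(21.4, 2.0))  # flux change if 21.4 kyr were really 2 kyr
```

Under these assumptions the correction is under 2% at 2 kyr and roughly 22% at 21.4 kyr, so reassigning a 21.4-kyr sample to 2 kyr would raise its inferred flux by about a fifth.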

Changes in production of ²³⁰Th by uranium decay in the water column are 
similarly unlikely. The residence time of dissolved uranium in the oceans is so 
long (200–400 kyr) that the production rate of ²³⁰Th cannot change substantially 
during the 15-20-kyr interval between the LGP and Holocene. 

Although sediment winnowing and focusing may influence the absolute ²³⁰Th 
concentrations, it is improbable that they induce relative offsets between the 
Holocene and the LGP. First, it has been suggested that ²³⁰Th-normalization of 
coarse carbonate may be unreliable, but recent evidence>” supports its utilization 
in the fine sediment fraction, which is compatible with our application of 
²³⁰Th-normalization to dust, barite and opal. Second, focusing factors, which measure 
the relative contributions of lateral and vertical sediment inputs, are nearly identical 
within the time slices: 1.28 ± 0.27 in the LGP and 1.25 ± 0.32 in the Holocene. This 
consistency between the time slices makes it unlikely that the observed changes in 
dust flux and productivity are an artefact of the ²³⁰Th systematics. Lastly, the slight 
sediment focusing (focusing factor >1) at these sites mitigates the grain size and 
sorting effects on the isotopes that are associated with sediment redistribution™ 
and particularly with winnowing”. As pointed out in ref. 55, moderate variations 
in lateral focusing do not appear to be influential in modifying the ²³¹Pa/²³⁰Th 
isotope ratios or their reflection of the local particle fluxes. The changes observed 
here are approximately 8% between time slices. 

Finally, ²³⁰Th-normalization assumes that all (100%) of the ²³⁰Th produced in 
the water column is scavenged and buried in the sediment, but lower scavenging 
efficiencies (for example, 80%) would result in lower ²³⁰Th concentrations and 
higher inferred mass fluxes. The only way to increase the scavenging efficiency of 
²³⁰Th is to increase the flux of particles responsible for scavenging thorium from 
the water column. The flux of ²³⁰Th scavenged from the water column is quite 
insensitive to changes in particle flux®*. Although not rigorously calibrated, we 
might qualitatively estimate that a doubling of particle flux would lead to a 10% 
increase in flux of ²³⁰Th. If the ²³⁰Th-normalized mass flux were accurate initially, 
then the result of doubled particle flux would be to underestimate the true flux 
by 10%. Specifically, we would infer an 81% increase in flux, whereas the true 
change would be a 100% increase. Nevertheless, the reconstructed signal shows the 
direction of change accurately, even though the amplitude is lower than it should 
be. Consequently, when we derive a lower particle flux during the LGP using the 
²³⁰Th-normalization method, we can be confident of the direction of the change, 
and we know that if there is a bias in the derived result then the actual change was 
even greater than the one we infer. 
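
The scavenging argument above is simple enough to verify numerically. In this sketch (the function name is hypothetical) the sediment ²³⁰Th concentration is the ²³⁰Th flux divided by the particle flux, and the normalization assumes the ²³⁰Th flux never changes, so the inferred change in flux is the true change divided by the change in ²³⁰Th flux.

```python
def inferred_flux_ratio(true_flux_ratio, th230_flux_ratio):
    """Change in flux that 230Th normalization would report.

    true_flux_ratio: actual particle flux relative to the reference state.
    th230_flux_ratio: scavenged 230Th flux relative to the reference state.
    """
    return true_flux_ratio / th230_flux_ratio


# Doubling of particle flux accompanied by a 10% rise in scavenged 230Th,
# the qualitative estimate given in the text.
ratio = inferred_flux_ratio(2.0, 1.1)
print(ratio)          # inferred flux relative to the reference state
print(ratio / 2.0)    # inferred flux as a fraction of the true flux
```

With these inputs the inferred flux is about 1.8 times the reference value, close to the 81% increase quoted above, against a true doubling. The direction of change is preserved and the amplitude is underestimated.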

Denitrification and the δ¹⁵N of upwelling waters. The eastern tropical Pacific 
contains two of the world’s largest oxygen-deficient zones. Under these suboxic 
conditions, bioavailable nitrogen is lost via water column denitrification, which 
does not go to completion and thus imparts a strong fractionation to the nitrate 
pool, generating increased δ¹⁵N in the residual nitrate. This high δ¹⁵N nitrate is 
distributed from the Oxygen Minimum Zone throughout the Pacific Ocean by 
circulation and remineralization of organic matter®”. This δ¹⁵N increase is recorded 
by organic matter deposited in the sediment, so that sedimentary δ¹⁵N will tend 
to increase as a function of denitrification intensity. The influence of denitrification 
on sedimentary δ¹⁵N is highest where partially consumed nitrate from an 
oxygen-deficient zone is upwelled into the surface, as occurs along the margin 
of the eastern tropical Pacific’; farther afield, the signal is diluted by lower δ¹⁵N 
nitrate and, on a basin and global scale, countered by the input of low δ¹⁵N from 
nitrogen fixation. The net result is that the δ¹⁵N of nitrate in the Equatorial 
Undercurrent, which upwells in the central equatorial Pacific, is largely unchanged 
from the initial δ¹⁵N of nitrate in Subantarctic Mode Water*®. The Equatorial 
Undercurrent originates in the far western Pacific, and its eastward flow is rapid, 
which may explain why the δ¹⁵N of its nitrate is so weakly affected by water column 
denitrification occurring in the eastern tropical Pacific. 

The intensity of water column denitrification has been reconstructed to be 
lower in the LGP than in the Holocene!®*?-®. With less elevation in the δ¹⁵N 
of nitrate from the oxygen-deficient zones, bulk sedimentary δ¹⁵N was lower 
by as much as 2.2‰ in the eastern tropical Pacific’. However, the lower δ¹⁵N 
does not appear to have propagated into the Equatorial Undercurrent. Records 
from the western equatorial Pacific, where bulk sedimentary δ¹⁵N may reflect 
changes in the Equatorial Undercurrent source water, show little change: 
0.03–0.14‰ lower in the LGP than in the Holocene”. Thus, although we 
cannot eliminate the potential influence of reduced water column denitrification 
on the FB-δ¹⁵N from the central equatorial Pacific (presented here), we infer 
that the effects are likely to be minimal. The north–south gradient in FB-δ¹⁵N 
is insensitive to changes in the δ¹⁵N of the nitrate upwelled along the Equator. 
Thus, the lack of change in this gradient between the Holocene and the LGP 
is further support that iron fertilization did not drive a substantial increase in 
nitrate consumption during the LGP, regardless of glacial–interglacial changes in 
denitrification. 
Extended Data Figure 1 | Line Islands core locations. Core sites are 
identified by their multicore numbers. The respective piston core numbers 
as well as latitude, longitude and depth are provided in Extended Data 
Table 1. The bathymetric map was generated using GeoMapApp™. 
Extended Data Figure 2 | Radiocarbon-based age models for all six 
cores. Core chronologies were established with four radiocarbon dates 
on G. ruber: 0 cm and 8 cm depth in the multicores (MC), and two depths 
(>8 cm) in the Big Bertha piston cores (BB) bracketing the δ¹⁸O maximum 
inferred to represent Marine Isotope Stage 2 (ref. 31). Age models were 
established via linear interpolation between radiocarbon dates 
(ka, thousands of years ago), which are provided in Extended Data Table 2. 
Extended Data Figure 3 | Time series for ²³¹Pa/²³⁰Th data within the LGP time slice. The relatively constant values for each core argue against any 
systematic bias from bioturbation of transient features, such as a deglacial productivity peak. 
Extended Data Figure 4 | Concentrations of opal, excess Ba and CaCO₃. CaCO₃ is the dominant sedimentary component, and systematic changes in its 
concentration dilute the concentrations of minor sedimentary components such as opal and excess Ba (Baxs). 
Extended Data Figure 5 | Lithogenic correction for Ba excess flux 
calculation. Samples from the Holocene (0-10,000 years ago) are orange; 
samples from the LGP (17,000-27,000 years ago) are blue. In the top panel, 
excess Ba concentrations were calculated by subtracting the lithogenic Ba 
fraction from the total Ba concentration using a lithogenic Ba/Th ratio of 
51.4, based on the average elemental concentrations in upper continental 
crust’. The lithogenic corrections are small, <2% for the Holocene and
<6% for the LGP. Excess Ba concentrations were then multiplied by the
total mass flux (230Th-normalized) in order to generate the excess Ba flux.
The bottom panel shows a comparison of excess Ba fluxes calculated using
different lithogenic Ba/Th ratios, ranging from 10 to 100, normalized to
the fluxes determined using a Ba/Th ratio of 51.4. Ratios less than 51.4
result in slightly higher excess Ba fluxes, while ratios greater than 51.4
result in slightly lower excess Ba fluxes. Overall, the excess Ba fluxes are
insensitive to the Ba/Th ratio chosen, with deviations only over a range of
±10%. ppm, parts per million; ppb, parts per billion.
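The lithogenic correction and the sensitivity test described in this caption can be sketched in a few lines of Python. Only the Ba/Th ratio of 51.4 and the tested range of 10 to 100 come from the caption; the sample concentrations, the mass flux and the function names are hypothetical values chosen for illustration.

```python
LITHOGENIC_BA_TH = 51.4  # lithogenic Ba/Th ratio quoted in the caption


def excess_ba(total_ba, th, ba_th_ratio=LITHOGENIC_BA_TH):
    """Excess Ba = total Ba minus the lithogenic fraction (ratio x Th)."""
    return total_ba - ba_th_ratio * th


def excess_ba_flux(total_ba, th, mass_flux, ba_th_ratio=LITHOGENIC_BA_TH):
    """Excess Ba flux = excess Ba concentration x 230Th-normalized mass flux."""
    return excess_ba(total_ba, th, ba_th_ratio) * mass_flux


# Hypothetical sample: concentrations in ppm, mass flux in g cm^-2 kyr^-1.
total_ba, th, mass_flux = 1000.0, 0.2, 1.5

ba_xs = excess_ba(total_ba, th)
lithogenic_fraction = (total_ba - ba_xs) / total_ba
print(f"Lithogenic correction: {100 * lithogenic_fraction:.2f}% of total Ba")

# Sensitivity to the assumed ratio, normalized to the 51.4 case.
flux_ref = excess_ba_flux(total_ba, th, mass_flux)
for ratio in (10, 51.4, 100):
    relative = excess_ba_flux(total_ba, th, mass_flux, ratio) / flux_ref
    print(f"Ba/Th = {ratio}: relative excess Ba flux = {relative:.3f}")
```

For this hypothetical sample a ratio below 51.4 gives a relative flux slightly above 1 and a ratio above 51.4 gives a value slightly below 1, matching the direction of the effect described in the caption.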
Extended Data Figure 6 | 230Th-normalized opal flux, excess Ba flux,
and 231Pa/230Th. Samples from the Holocene (0-10,000 years ago)
are orange; samples from the LGP (17,000-27,000 years ago) are blue.
231Pa/230Th is positively correlated with the opal flux (r² = 0.90, P < 0.001)
and excess Ba flux (r² = 0.85, P < 0.01). Excess Ba flux and opal flux are
also positively correlated (r² = 0.63, P < 0.01). For opal flux, the correlation
with 231Pa/230Th is especially strong during the LGP (opal flux r² = 0.98,
P < 0.001). The relationship (that is, the slope) between 231Pa/230Th and
opal flux may be altered by changes in preservation, which affects opal
but not 231Pa/230Th. Poor opal preservation (for example, if the diatom
frustules were less silicified) would elevate the 231Pa/230Th relative to
the sedimentary opal flux, thus steepening the slope. However, the
relationship between 231Pa/230Th and opal flux is temporally invariant,
with slopes of 7.89 + 1.24 x 10? in the Holocene, 8.09 + 1.31 x 107? in the 
LGP, and 7.60 + 0.53 x 10~? overall. We interpret these results to indicate 
that there was no important change in opal preservation between the LGP 
and Holocene and, therefore, that frustule silicification (potentially related 
to iron stress*?-*“), similarly remained unchanged. 
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Extended Data Figure 7 | Absolute and relative change for each 
productivity proxy versus the Holocene value of that proxy. The absolute
change (Holocene minus LGP) in productivity proxy is shown in the top
panels; the relative change is shown in the bottom panels. Excess Ba flux
is shown in green, opal flux in purple, and 231Pa/230Th in red. The greatest
change in productivity occurred at the sites with the highest Holocene
productivity values. The core that shows a negative change in productivity
is the most northerly core (7° N), which is outside the high-nutrient,
low-chlorophyll equatorial upwelling zone, and thus displays different
glacial-interglacial nutrient dynamics. The relative change in productivity
(Holocene to the LGP) is fairly constant across the five cores within the
equatorial upwelling zone at 11% for excess Ba flux, 24% for opal flux,
and 8% for 231Pa/230Th. The inter-proxy difference may reflect a nonlinear
scaling of productivity with 231Pa/230Th, because these radionuclides are
scavenged to some extent by all particle phases. In practice, this difference
suggests that 231Pa/230Th may provide a conservative estimate for changes
in productivity, with true productivity changes potentially at much higher
amplitude. Error bars are 2σ and indicate analytical precision.
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Extended Data Table 1 | Locations of the Line Islands cores 
MC #   BB #   Lat (°N)   Long (°W)   Depth (m)

14     13     -0.22      156.0       3049
21     20      1.27      157.3       2850
26     25      2.46      159.4       3545
29     28      2.97      159.2       3152
33     32      5.20      160.4       2933
39     36      6.83      161.0       2859
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Extended Data Table 2 | Radiocarbon ages for age model generation 


Core   Depth (cm)   Radiocarbon age (yr)   ±     Calendar age (yr)   ±

14MC   0            3460                   35    3340                110
14MC   8            5770                   30    6200                90
13BB   34           14580                  40    17260               180
13BB   53           20760                  70    24440               250
21MC   0            2330                   25    1950                80
21MC   8            3500                   35    3390                110
20BB   40           14685                  40    17400               200
20BB   55           19100                  70    22540               160
26MC   0            3900                   35    3880                130
26MC   8            4740                   35    4980                130
25BB   25           15950                  60    18800               140
25BB   35           22400                  100   26200               230
29MC   0            4200                   15    4300                80
29MC   8            4770                   20    5020                110
28BB   54           16600                  60    19550               220
28BB   76           25400                  160   29030               360
33MC   0            4930                   35    5270                150
33MC   8            5260                   20    5610                40
32BB   25           15350                  75    18170               220
32BB   35           19300                  75    22750               240
39MC   0            4730                   45    4960                130
39MC   8            5800                   20    6230                80
36BB   15           11650                  25    13150               140
36BB   25           16700                  90    19670               190

Error bars are 2σ and indicate analytical precision
Proton-gated Ca2+-permeable TRP channels
damage myelin in conditions mimicking ischaemia 


Nicola B. Hamilton1, Karolina Kolodziejczyk1†, Eleni Kougioumtzidou1 & David Attwell1


The myelin sheaths wrapped around axons by oligodendrocytes
are crucial for brain function. In ischaemia myelin is damaged in a
Ca2+-dependent manner, abolishing action potential propagation’.
This has been attributed to glutamate release activating
Ca2+-permeable N-methyl-D-aspartate (NMDA) receptors‘. Surprisingly,
we now show that NMDA does not raise the intracellular Ca2+
concentration ([Ca2+]i) in mature oligodendrocytes and that,
although ischaemia evokes a glutamate-triggered membrane
current’, this is generated by a rise of extracellular [K+] and
decrease of membrane K+ conductance. Nevertheless, ischaemia
raises oligodendrocyte [Ca2+]i, [Mg2+]i and [H+]i, and buffering
intracellular pH reduces the [Ca2+]i and [Mg2+]i increases, showing
that these are evoked by the rise of [H+]i. The H+-gated [Ca2+]i
elevation is mediated by channels with characteristics of TRPA1,
being inhibited by ruthenium red, isopentenyl pyrophosphate,
HC-030031, A967079 or TRPA1 knockout. TRPA1 block reduces myelin
damage in ischaemia. These data suggest that TRPA1-containing ion
channels could be a therapeutic target in white matter ischaemia.

Ischaemia blocks action potential propagation through myelinated
axons’. Electron microscopy’ and imaging of dye-filled oligodendrocytes?
show ischaemia-evoked Ca2+-dependent damage to the
capacitance-reducing myelin sheaths, which causes loss of action potential
propagation. Glutamate receptor block reduces myelin damage and
action potential loss?~”, and glutamate evokes a membrane current
in oligodendrocytes mediated by AMPA
(α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid)/kainate and NMDA receptors” *.
Thus, oligodendrocyte damage is thought to be excitotoxic: as for
neurons in ischaemia, a rise of glutamate concentration® caused by reversal
of glutamate transporters in oligodendrocytes and axons”"” activates
receptors that raise” oligodendrocyte [Ca2+]i, thus damaging the cells.

However, although AMPA/kainate and NMDA receptors regulate
oligodendrocyte precursor development!!"”, these receptors are
downregulated as the cells mature'?-'°. How can mature oligodendrocytes be
damaged excitotoxically, if they express low levels of glutamate
receptors? To investigate how oligodendrocyte [Ca2+]i is raised in ischaemia,
we characterized ischaemia-evoked membrane current and [Ca2+]i
changes in cerebellar white matter oligodendrocytes.

Solution mimicking ischaemia (see Methods) evoked an increasing
inward current in oligodendrocytes (Fig. 1a, b), often with a faster
phase that was obscured when responses in many cells were averaged
(Fig. 1c). When applied from before the ischaemia, NBQX and D-AP5
reduced the ischaemia-evoked current by 66% (Fig. 1c, d), while mGluR
block had no effect (Extended Data Fig. 1a). Preloading for 30 min with
the glutamate transport blocker PDC, to prevent ischaemia-evoked
glutamate release by reversal of transporters in the white? and grey'® matter,
also reduced the inward current (by 68%, Fig. 1c, d), while blocking
other candidate release mechanisms had no effect (Extended Data Fig. 1a).
Thus, glutamate release by reversed uptake helps to trigger the
ischaemia-evoked current. Notably, however, current flow through glutamate
receptors generates only a small fraction of the sustained inward
current evoked by ischaemia, since applying NBQX and D-AP5 from
200 s after ischaemia had started produced only a non-significant 21%
suppression of the ischaemia-evoked inward current (Fig. 1d).

In neurons, an ischaemia-evoked inward current triggered by glutamate
release, but maintained by non-glutamatergic mechanisms, generates
the ‘extended neuronal depolarization’ (END) that evokes neuronal
death'”. However, the ischaemia-evoked current in oligodendrocytes
was not prevented by removing external Ca2+, nor by gadolinium,
which both block the END” (Fig. 1d, e), implying a different
mechanism maintains the inward current triggered by glutamate.

Unlike in neurons, where ischaemia evokes a conductance increase
mediated by ionotropic glutamate receptors'®, ischaemia decreased the
conductance of oligodendrocytes (Fig. 1f, g). The suppressed current
reversed below the K+ reversal potential (EK = −104 mV), at −121 mV
with 10 mM (Fig. 1f, Extended Data Fig. 1e) and −118 mV with 0.5 mM
HEPES (Fig. 1g). This is expected if ischaemia decreases the
membrane K+ conductance, while [K+]o rises (due to Na+/K+ pump
inhibition throughout the slice) which increases the inward current at all
potentials (see Extended Data Fig. 1b, d; Supplementary Information).
K+-sensitive electrodes showed that [K+]o in the white matter initially rose
slowly during ischaemia, but then increased more abruptly, in parallel
with the membrane current (Fig. 1h). The peak rise was 2.35 ± 0.13 mM
(n = 12) in the white matter and 2.48 ± 0.35 mM (n = 4) in the adjacent
grey matter (where it reflects the anoxic depolarization of neurons’®).
The conductance decrease described above produced 32%, while the
[K+]o rise produced 68%, of the inward current in oligodendrocytes at
−74 mV (see Supplementary Information). Thus, changes in K+ fluxes
generate the ischaemia-evoked inward current.
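
As a rough consistency check on the quoted K+ reversal potential, the Nernst equation can be evaluated for the solutions described in Methods. The sketch below is illustrative only: the internal K+ concentration (130 mM, taken from the K-gluconate pipette solution), the temperature (35 °C, within the stated 33-36 °C range) and the function name `nernst_mv` are assumptions, and junction potentials and ion activities are ignored.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_mv(conc_out_mm, conc_in_mm, temp_c, valence=1):
    """Nernst reversal potential in mV for an ion of the given valence."""
    temp_k = temp_c + 273.15
    return 1000.0 * R * temp_k / (valence * F) * math.log(conc_out_mm / conc_in_mm)


K_IN = 130.0    # mM, assumed from the K-gluconate pipette solution
TEMP_C = 35.0   # assumed, within the 33-36 degC used for ischaemia experiments

ek_rest = nernst_mv(2.5, K_IN, TEMP_C)          # 2.5 mM bath K+
ek_isch = nernst_mv(2.5 + 2.35, K_IN, TEMP_C)   # after the mean 2.35 mM rise
print(f"E_K at rest: {ek_rest:.1f} mV; after [K+]o rise: {ek_isch:.1f} mV")
```

Under these assumptions the resting value comes out close to the −104 mV quoted above, and the measured [K+]o rise shifts EK in the positive direction by well over 10 mV, which is the direction needed to increase inward current at a fixed holding potential.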

Could part of the NMDA-evoked inward current in
oligodendrocytes* also reflect a [K+]o rise? Extracellular Cs+ blocked the
NMDA-evoked current while intracellular MK-801 had no effect (Extended
Data Fig. 2a-d), suggesting that most of the NMDA-evoked current
is generated by [K+]o rising, rather than by oligodendrocyte NMDA
receptors. Applying NMDA or raising [K+]o, and correlating the
resulting inward current with the [K+]o rise occurring (see Supplementary
Information; Extended Data Fig. 2e, f), we found that at least 49% of
the NMDA-evoked current was attributable to the [K+]o rise that it
produced. Since mature oligodendrocytes express few NMDA
receptors!?-!, this presumably reflects NMDA depolarizing neurons or
astrocytes in the slice and releasing K+.

These data challenge the idea*~> that, during ischaemia, NMDA
receptors in mature oligodendrocytes generate a prolonged calcium
influx which damages the cells. We therefore investigated the ion
concentration changes evoked in oligodendrocytes by activation of NMDA
receptors, using Ca2+-, Na+- and K+-sensitive dyes loaded into cells
from the pipette. When 100 µM NMDA was applied to whole-cell
clamped cerebellar granule neurons at −74 mV, as expected it evoked an
inward current, and raised [Ca2+]i and [Na+]i (Extended Data Fig. 3).
In contrast, although NMDA evoked an inward current in
oligodendrocytes, it generated no [Ca2+]i or [Na+]i elevation; indeed [Na+]i
Figure 1 | Ischaemia evokes an inward current in oligodendrocytes
by altering K+ fluxes. a, Whole-cell clamped rat oligodendrocyte. Inset,
Alexa dye in processes around an axon. b, Ischaemia-evoked membrane
current in single cell. c, Current in 179 control cells, 12 cells in slices
exposed to 25 µM NBQX and 200 µM D-AP5 from before ischaemia, or
9 cells in slices preloaded! with 1 mM PDC. d, Current (normalized to
interleaved control cells) from 8-10 min after start of ischaemia in cells
preloaded with PDC, exposed to NBQX+AP5 throughout ischaemia or
from 200 s after ischaemia starts, or exposed to NBQX or AP5 alone or to
zero-Ca2+ solution (with 50 µM EGTA) throughout ischaemia.
Mann-Whitney P values compare with control cells; cell numbers shown on
bars. e, Effect of Gd3+ (100 µM) on ischaemia-evoked current at 8-10 min
(Mann-Whitney P = 0.83). f, I-V relation of 10 cells before and after
5 min ischaemia (10 mM HEPES internal). g, Ischaemia-evoked current in
10 cells with 0.5 mM and 9 cells with 50 mM internal HEPES. Ischaemia
decreased cell conductance by 2.1 ± 0.7 nS near −70 mV in 11 cells using
10 mM, and by 2.3 ± 0.6 nS in 10 cells using 0.5 mM, internal HEPES;
50 mM HEPES abolished the decrease (Fig. 3i). h, Change of [K+]o in grey
matter (GM, granule cell layer), and in white matter (WM, different slice)
with simultaneously recorded oligodendrocyte current. Error bars, s.e.m.


decreased after applying NMDA (Fig. 2a). NMDA raised [K+]i
however. Similar concentration changes were seen at the soma (Fig. 2a) and
in the internodal processes where NMDA receptors may be located”~*
(Fig. 2b). Like NMDA, raising [K+]o lowered [Na+]i (Fig. 2a, b). A
likely explanation is that NMDA raises [K+]o (Fig. 2c), which decreases
[Na+]i by activating the Na+/K+ pump.

The absence of a rise of [Ca2+]i and [Na+]i is surprising if
oligodendrocytes express NMDA receptors?~*. Conceivably, NMDA receptors
might pass ions into a compartment in their myelinating processes
which only certain Ca2+-sensing dyes such as X-Rhod-1 can access”.
However, whether X-Rhod-1 was loaded as an acetoxymethyl ester?
or from the pipette, we observed no NMDA-evoked change of [Ca2+]i
in the myelinating processes (Fig. 2d). Nevertheless, we could detect
spontaneous [Ca2+]i rises propagating through myelinating processes
in 55% of oligodendrocytes (Fig. 2e).

To confirm that ischaemia raises” oligodendrocyte [Ca2+]i, we loaded
the Ca2+-sensing dye Fluo-4 (with Alexa Fluor 594, for ratiometric
Figure 2 | NMDA does not elevate [Ca2+]i in oligodendrocytes.
a, b, Rat oligodendrocyte membrane current (lower traces, a) and
background-subtracted fluorescent dye ratio (R, see Methods, concentration
increases are upwards for all dyes) when measuring [Ca2+]i with Fura-2,
[Na+]i with SBFI, and [K+]i with PBFI; 100 µM NMDA was applied, or
[K+]o was raised from 2.5 to 5 mM, with fluorescence measured in soma
(a) or myelinating processes (b). Right panels, peak fluorescence change
normalized to evoked current (number of cells on bars). c, NMDA-evoked
current and simultaneously recorded [K+]o. d, Measuring [Ca2+]i with
X-Rhod-1, loaded from pipette (n = 6) or as an acetoxymethyl ester? (n = 15),
reveals no NMDA-evoked [Ca2+]i rise. e, Spontaneous [Ca2+]i transients in
four myelinating processes confirm Fura-2 is working. Error bars, s.e.m.


imaging) into oligodendrocytes from a whole-cell pipette (in
current-clamp mode, allowing voltage changes, in case voltage-gated Ca2+
channels raise [Ca2+]i). Ischaemia increased [Ca2+]i in the soma and
processes over ~10 min. This was abolished if extracellular calcium was
removed (Fig. 3a, b), and reduced by removing external K+ (Fig. 3c),
suggesting that the ischaemia-evoked [K+]o rise promotes calcium
entry from the extracellular solution. However, contradicting the
earlier report’, blocking NMDA receptors with MK-801, D-AP5 and
7-chloro-kynurenate, or blocking NMDA and AMPA/kainate
receptors with NBQX and D-AP5 while blocking voltage-gated Na+ and
Ca2+ channels and GABAA receptors, did not prevent the [Ca2+]i rise
(Fig. 3a, b). Similarly, when PDC-preloading reduced
transporter-mediated glutamate release, the [Ca2+]i rise was unaffected (Fig. 3c).

Similar experiments using a Mg2+-sensitive dye revealed that [Mg2+]i
also rises in ischaemia (Fig. 3d). This was not due to ATP breakdown,
which releases Mg2+, since the [Mg2+]i rise was abolished by
removing extracellular Mg2+ (Fig. 3d) implying that Mg2+ enters across the
cell membrane. Surprisingly, ischaemia did not raise [Na+]i (Fig. 3e).
Thus ischaemia activates a membrane conductance that allows entry
of divalent ions.

Seeking an agent that decreases membrane K+ conductance and
activates Ca2+ entry, we measured the ischaemia-evoked pH change
in oligodendrocytes. Ischaemia increased [H+]i on the timescale seen
for [Ca2+]i (Fig. 3f). A similar (but smaller) [H+]i rise was evoked by
Figure 3 | Ischaemia evokes a [Ca2+]i and [Mg2+]i rise gated by internal
protons. a, Ratiometric Fluo-4/Alexa-Fluor-594 signals ([Ca2+]i) and
membrane potential (Vm) in rat oligodendrocytes exposed to ischaemia
(starting at arrow), or ischaemia in zero [Ca2+]o, or with drugs at
concentrations (µM): AP5 50, MK-801 50, 7-chlorokynurenate (7CK)
100, NBQX 25, GABAzine (Gz) 20, TTX 1, Cd2+ (to block Ca2+ channels)
100. b, Mean data from experiments like a (cell numbers on bars; P values
compare with soma or process control values). c, Data as in a after
PDC-preloading or in 0 mM [K+]bath. d, Ischaemia-evoked [Mg2+]i rise monitored
with Mag-Fluo-4 in normal and Mg2+-free solution. e, Ischaemia-evoked
[Na+]i change monitored with SBFI (6 cells). f, Ischaemia-evoked [H+]i rise
monitored with BCECF with 0.5 mM and 50 mM internal HEPES (P from
Mann-Whitney test). g, [K+]o and oligodendrocyte membrane potential
(Vm) with 2.5 or 0 mM [K+]bath, before (Resting) and during ischaemia
(Isch peak), and change produced by ischaemia (ΔIsch). h, Effect of
removing K+ from bath solution on [H+]i in control conditions (relative
to value at start of K+ removal), and [H+]i increase evoked by ischaemia
(normalized to value at start of ischaemia) with 2.5 or 0 mM bath K+.
i, High [HEPES]i blocks ischaemia-evoked decrease of membrane
conductance. j, k, Ischaemia-evoked rise of [Ca2+]i (j) and [Mg2+]i (k) are
inhibited with 50 mM internal HEPES. l, m, Uncaging H+ with light (bars)
raises [Ca2+]i (l, an effect reduced by 200 µM IPP or 80 µM HC-030031) and
[Mg2+]i (m), but not when caged H+ is omitted from the pipette. P values in
l from Mann-Whitney tests. Error bars, s.e.m.


elevated [K+]o or NMDA (Extended Data Fig. 4a), suggesting that the
ischaemia-evoked [K+]o rise partly generates this pH change. To
investigate this, we removed external K+, which reduced the [K+]o in the
slice from 2.46 ± 0.02 mM (n = 13) to 0.99 ± 0.30 mM (n = 4)
(Mann-Whitney P = 0.001, Fig. 3g), and hyperpolarized the resting potential
by 7 mV (−84.0 ± 4.7 to −77.2 ± 4.3 mV, n = 4, P = 0.002, paired t-test,
Fig. 3g). The ischaemia-evoked [K+]o rise and depolarization were
unaffected by K+ removal (Fig. 3g), but the [H+]i initially and
during ischaemia were reduced (Fig. 3h). These data and those in Fig. 3c
suggest that the ischaemia-evoked [K+]o rise helps to acidify the cell,
which in turn evokes Ca2+ entry.

Using an internal solution containing 50 mM HEPES, the
ischaemia-evoked rise of [H+]i was, as expected, greatly reduced (Fig. 3f). This
prevented the decrease of K+ conductance (Figs 1g, 3i), consistent
with intracellular acidity suppressing activity of tonically-active K+
channels”. Strikingly, however, buffering intracellular pH also
prevented the ischaemia-evoked rise of [Ca2+]i and [Mg2+]i (Fig. 3j, k),
implying that the ischaemia-evoked [H+]i rise activates entry of these
cations into the cell. Consistent with this, uncaging protons in
oligodendrocytes raised [Ca2+]i and [Mg2+]i (Fig. 3l, m).

Few channels allow entry of Ca2+ and Mg2+ better than Na+, but
many TRP channels share this property”’ (and TRP channel
activation can cause ischaemic damage to neurons”! and astrocytes”). Of
these channels, only?0?3.4 TRPA1 and TRPV3 are known to be
activated by intracellular H+, so we applied modulators of these channels
and examined the effect on oligodendrocyte [Ca2+]i (TRP agonist
and blocker specificity are discussed in Supplementary Information).
The TRPA1/TRPV3 blocker isopentenyl pyrophosphate” (IPP) and
the TRPA1 blocker?® HC-030031 slowed and reduced the [Ca2+]i
rise evoked by uncaging H+ in the cell (Fig. 3l). The TRPA1/TRPV3
agonists?” menthol, vanillin, carvacrol (Cv) and 2-APB all raised
[Ca2+]i in oligodendrocyte somata and myelinating processes, as did
the TRPA1 agonists”” AITC, polygodial and flufenamic acid, while the
TRPV3 agonists???” camphor and farnesyl pyrophosphate (FPP) did
not (Fig. 4a, b). Thus, TRPA1 subunit-including channels contribute to
these responses, but TRPV3 channels are not needed. The
carvacrol-evoked rise of [Ca2+]i was reduced by HC-030031 which blocks TRPA1
but not TRPV3”, by the TRPA1/TRPV3 blocker isopentenyl
pyrophosphate” (IPP), and by TRPA1 knockout (Fig. 4b), again implying
involvement of TRPA1 channels. It was unaffected by buffering [H+]i
(Fig. 4b), consistent with Ca2+ entry via TRPA1 being downstream of
the [H+]i rise, as seen with H+ uncaging (Fig. 3l).

Using in situ hybridization, a TRPA1 probe labelled cerebellar 
white matter in rat and mouse, while a TRPV3 probe labelled rat only 
(Extended Data Fig. 5a). Immunocytochemistry revealed that TRPA1- 
and TRPV3-expressing cells included myelinating oligodendrocytes 
expressing Olig2 and CC1 (Extended Data Fig. 5b-d). 

Consistent with ischaemia raising [Ca2+]i by activating TRPA1-
rather than TRPV3-containing channels, the general TRP blockers*”””
ruthenium red (RuR, 10 µM) and La3+ (1 mM) and the TRPA1/TRPV3
blocker?° IPP (200 µM) reduced the [Ca2+]i rise, as did HC-030031
and A967079, which block TRPA1 but not TRPV3”°”S (Fig. 4c-e).
Knockout of TRPA1 slowed and halved the ischaemia-evoked [Ca2+]i
rise, and the TRPA1/TRPV3 blocker IPP produced no further
reduction in the knockout (suggesting no contribution of TRPV3; see Fig. 4f),
while knockout of TRPV3 did not affect the [Ca2+]i rise, and the TRPA1
blocker HC-030031 slowed and halved the rise occurring in the TRPV3
knockout (Fig. 4g). Blockers of many other TRP channels had no effect
(Extended Data Fig. 6). Thus TRPA1 is the dominant contributor to the
ischaemia-evoked rise of [Ca2+]i in oligodendrocytes.

The larger (70%) block of the ischaemia-evoked [Ca2+]i rise by
the TRPA1 blocker HC-030031 than by TRPA1 knockout (50%:
Figs 4e, f) suggests that there may be compensatory upregulation
of another Ca2+ entry pathway in the TRPA1 knockout, which
normally generates only ~30% of the [Ca2+]i rise. Introducing high
pH-buffering-power solution into the cell blocked the [Ca2+]i rise
in the TRPA1 knockout (Fig. 4h), implying that the non-TRPA1
Ca2+ entry pathway is also H+-activated. Since the non-specific TRP
blockers RuR and La3+ abolished the ischaemia-evoked [Ca2+]i rise,
these data suggest that there is another Ca2+-permeable TRP
channel (neither TRPA1 nor TRPV3) that is activated by internal H+
in oligodendrocytes and generates ~30% of the ischaemia-evoked
[Ca2+]i rise.


28 JANUARY 2016 | VOL 529 | NATURE | 525 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 



Figure 4 | TRPA1 mediates ischaemic Ca2+ accumulation and myelin
damage. a, Δ[Ca2+]i to TRPA1/TRPV3 agonists (mM) (menthol 2, vanillin
1, 2-APB 2), TRPV3 agonists (camphor 2, FPP 0.5) and TRPA1 agonists
(AITC 0.5, FFA 1, polygodial 0.2). b, Δ[Ca2+]i to TRPA1/TRPV3 agonist
carvacrol (2 mM, in 1 µM TTX) is inhibited by TRPA1/TRPV3 antagonist
isopentenyl pyrophosphate (IPP, 200 µM, Mann-Whitney test on soma), TRPA1
antagonist HC-030031 (80 µM), and TRPA1 knockout (KO) (Mann-Whitney
test on processes), but not by 50 mM internal [HEPES] (P values compare
with control). c, d, Ischaemia-evoked Δ[Ca2+]i is blocked by ruthenium red
(RuR, 10 µM) (c), and La3+ (d, 1 mM, using HEPES-buffered external,
Mann-Whitney test on soma). e, Block of ischaemia-evoked Δ[Ca2+]i by
TRPA1/TRPV3 blocker IPP (200 µM) and TRPA1 blockers HC-030031 (80 µM) and
A967079 (10 µM). Mann-Whitney P values compare with control. f,
Ischaemia-evoked Δ[Ca2+]i in wild-type mice, with TRPA1 knocked out (P = 0.07 for
processes) and with TRPA1/TRPV3 blocker IPP (200 µM) also present.
g, Ischaemia-evoked Δ[Ca2+]i in wild-type mice, with TRPV3 knocked out
and with TRPA1 blockers HC-030031 (80 µM, 6 cells) or A967079 (10 µM,
1 cell) also present (Mann-Whitney P values). h, Ischaemia-evoked Δ[Ca2+]i
in TRPA1 KO with 10 and 50 mM internal [HEPES]. i, Electron microscopy
showing control optic nerves and myelin decompaction (white arrows) after
60 min ischaemia or ischaemia in RuR (10 µM), or A967079 (10 µM) and
HC-030031 (80 µM) together (TRPA1 block). j-m, Lamella separations (j),
g ratio (k), axon diameter (l) and axon vacuoles (m, electron microscopy
shows vacuoles (black arrows) within axon and periaxonal space) in control,
ischaemia alone or ischaemia with RuR (10 µM) or with A967079 (10 µM) and
HC-030031 (80 µM) (TRPA1 block). Bar numbers are ‘images (axons)’. P values
for j-m from Mann-Whitney tests, except l from Kolmogorov-Smirnov test.
Error bars, s.e.m.; data from rat unless stated otherwise.
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To assess the role of TRPA1-containing channels in evoking myelin 
damage, we exposed rat optic nerves to 60 min ischaemia. This led to 
disruption of myelin sheaths””, which we quantified by counting the 
regions of myelin decompaction (lamellar separation) per axon cross 
section (see Methods). Taking as baseline the level of decompaction 
that occurs in control nerves during processing for electron micros- 
copy, ischaemia increased decompaction (P=3 x 107), and ruthenium 
red or the TRPA1 blockers HC-030031 and A967079 (applied together) 
reduced this increase by 69% (P=3 x 107°) and 59% (P=2 x 107), 
respectively (Fig. 4i, j). Ischaemia did not affect the axon g ratio (Fig. 4k;
see Methods), but increased axon diameter through swelling (Fig. 4l).
It also caused some axon vacuolization (Fig. 4m): vacuoles were seen 
in 4.3% of 2,390 control axons, but in 21% of 15,976 axons after ischae- 
mia. Vacuolization was not prevented by TRP channel block (Fig. 4m), 
suggesting different mechanisms for axon and myelin damage. 
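
To illustrate how the g ratio can stay constant while axon diameter increases, the following Python sketch uses the conventional definition of the g ratio (axon diameter divided by the diameter of the axon plus its myelin sheath). This definition is assumed here, because the paper's own measurement procedure is given only in its Methods; the diameters and the function name `g_ratio` are hypothetical.

```python
def g_ratio(axon_diameter_um, fibre_diameter_um):
    """Conventional g ratio: inner (axon) diameter / outer (axon + myelin) diameter."""
    if not 0 < axon_diameter_um <= fibre_diameter_um:
        raise ValueError("expected 0 < axon diameter <= fibre diameter")
    return axon_diameter_um / fibre_diameter_um


# Hypothetical diameters in micrometres, chosen only to show the arithmetic.
control = g_ratio(0.60, 0.80)
swollen = g_ratio(0.75, 1.00)   # axon and whole fibre both 25% larger

print(f"control g ratio: {control:.2f}; swollen g ratio: {swollen:.2f}")
```

In this hypothetical case the axon swells by 25% but the ratio is unchanged because the outer fibre diameter grows in proportion.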

Thus, ischaemic damage to oligodendrocytes differs fundamentally
from that in neurons (Extended Data Fig. 7), where [Ca2+]i is raised
by glutamate-gated receptors (and later by TRP channels activated by
reactive oxygen species”!). Contradicting current ideas”, ischaemia
does not damage oligodendrocytes by activating Ca2+ entry through
ionotropic glutamate receptors in their membranes. Instead,
ischaemia-evoked sodium pump inhibition and glutamate release evoke a
long-lasting rise of [K+]o that, together with metabolic changes, acidifies
the oligodendrocyte, activating H+-gated TRP channels through which
Ca2+ enters.

In the optic nerve, ischaemia-evoked Ca2+ entry into
oligodendrocytes is blocked by NMDA receptor antagonists’, contradicting our
demonstrations that NMDA evokes no [Ca2+]i rise in
oligodendrocytes (Fig. 2) and that the ischaemia-evoked [Ca2+]i rise is unaffected
by NMDA receptor blockers (Fig. 3). Conceivably, in the optic nerve,
NMDA receptors on astrocytes” make a greater contribution than in
cerebellum to generating the ischaemia-evoked rise of [K+]o and thus
the [Ca2+]i rise.

TRPA1 generates ~70% of the ischaemia-evoked [Ca2+]i rise,
and TRPA1 blockers reduce ischaemic damage to myelin (Fig. 4).
Consequently, blocking oligodendrocyte TRPA1-containing channels
may reduce myelin loss during the energy deprivation that follows
stroke, secondary ischaemia caused by spinal cord injury, or hypoxia
in multiple sclerosis*°.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
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METHODS 


Animals. Experiments used Sprague-Dawley rats or transgenic mice of either sex. 
Data are from rats unless stated otherwise. Animal procedures were carried out 
in accordance with the guidelines of the UK Animals (Scientific Procedures) Act 
1986 and subsequent amendments. TRPV3 knockout (KO) mice were obtained 
from JAX (http://jaxmice.jax.org/strain/010773.html). TRPA1 KO mice were 
obtained as a double knockout with TRPV1 knocked out (kindly provided by J. 
Wood and J. Sexton). TRPV1 does not contribute to the ischaemia-evoked [Ca2+]i
rise described here because the TRPV1 antagonist”””’ capsazepine did not reduce
the ischaemia-evoked [Ca2+]i rise in rat oligodendrocytes (Extended Data Fig. 6a)
and the TRPV1 agonists”””” capsaicin (10 µM) and camphor (2 mM) did not evoke
a [Ca2+]i rise (see Specificity of drugs acting on TRP channels section and Fig. 4a).
Wild-type and (double) KO mice were from a colony obtained by breeding mice 
doubly heterozygous for the TRPA1 and TRPV1 knockouts. The wild-type and KO 
mice compared shared the same doubly heterozygous grandparents. 

Brain slice preparation. Cerebellar slices (225 µm thick) were prepared from
the cerebellum of P12 rats in ice-cold solution containing (mM) 124 NaCl, 26
NaHCO3, 1 NaH2PO4, 2.5 KCl, 2 MgCl2, 2-2.5 CaCl2, 10 glucose, bubbled with
95% O2/5% CO2, pH 7.4, as well as 1 mM Na-kynurenate to block glutamate
receptors. Slices were then incubated at room temperature (21-24 °C) in the same
solution until used in experiments. Cerebellar slices from P10-17 mice were prepared
in ice-cold solution containing (mM) 87 NaCl, 25 NaHCO3, 1.25 NaH2PO4, 2.5 KCl,
7 MgCl2, 0.5 CaCl2, 25 glucose, 75 sucrose, 1 Na-kynurenate and then transferred
to the same solution at 27 °C and allowed to cool naturally to room temperature.
Only 1 cell was recorded from in each slice.

Cell identification and electrophysiology. Oligodendrocytes, cerebellar granule
cells and hippocampal pyramidal cells were identified by their location and
morphology. All cells were whole-cell clamped with pipettes with a series resistance
of 8-30 MΩ. Electrode junction potentials were compensated. I-V relations were
from responses to 200 ms voltage steps. Unless otherwise indicated, cells were
voltage-clamped at −74 mV.

External solutions. Slices were superfused with either bicarbonate-buffered solution
containing (mM) 124 NaCl, 2.5 KCl, 26 NaHCO₃, 1 NaH₂PO₄, 2-2.5 CaCl₂,
1 MgCl₂, 10 glucose, pH 7.4, bubbled with 95% O₂ and 5% CO₂, or with HEPES-buffered
solution containing (mM) 144 NaCl, 2.5 KCl, 10 HEPES, 1 NaH₂PO₄,
2-2.5 CaCl₂, 1 MgCl₂, 10 glucose, pH set to 7.3 with NaOH, bubbled with 100%
O₂. During experiments when NMDA was applied and ion concentration changes
were observed with ion-sensitive dyes, MgCl₂ was omitted from the solution to
minimize the Mg²⁺ block. For experiments involving Gd³⁺ and La³⁺, the HEPES-based
solution was used and NaH₂PO₄ was omitted. To simulate ischaemia we
replaced external O₂ with N₂, and external glucose with 7 mM sucrose, added
2 mM iodoacetate to block glycolysis, and 25 µM antimycin to block oxidative
phosphorylation³¹. All ischaemia experiments were done at 33-36 °C, while applications
of NMDA and of TRP channel agonists were at 24 °C. Control and drug
conditions were interleaved where appropriate. For calcium imaging experiments
when applying ischaemia solution to brain slices from transgenic mice, the experimenter
was blind to the genotype.

Intracellular solutions. Cells were whole-cell clamped with electrodes containing
either Cs- (to improve voltage uniformity) or K-gluconate-based solution, comprising
(mM) 130 Cs-gluconate (or K-gluconate), 2 NaCl, 0.5 CaCl₂, 10 HEPES,
10 BAPTA, 2 NaATP, 0.5 Na₂GTP, 2 MgCl₂, 0.5 K-Lucifer yellow, pH set to 7.2 with
CsOH or KOH (all from Sigma). The K⁺-based solution was used for current-clamp
experiments. For Ca²⁺ imaging experiments, BAPTA was decreased to
0.01 mM and replaced with 10 mM phosphocreatine, added CaCl₂ was reduced
to 10 µM, and Lucifer yellow was replaced with 1 mM Fura-2, or 200 µM Fluo-4
with 50 µM Alexa Fluor 594, or 200 µM X-Rhod-1 with 50 µM Alexa Fluor 488
(all from Molecular Probes) to allow ratiometric imaging. For imaging pH, Lucifer
yellow was replaced with BCECF (96 µM) and the HEPES concentration was
decreased to 0.5 mM. This [HEPES] was also used for control experiments when
examining the effect of 50 mM internal [HEPES] on the ischaemia-evoked current;
ischaemia-evoked membrane current changes were indistinguishable when
0.5 and 10 mM HEPES were used (see main text), presumably because endogenous
pH buffering dominates at these low [HEPES] levels. For experiments where the
pH-buffering capacity of the internal solution was increased, 68 mM K-gluconate
and 50 mM HEPES were used. When uncaging protons, 2 mM 1-(2-nitrophenyl)ethyl
sulphate sodium salt (NPE-caged protons, Tocris) was added to the pipette
solution and 10 mM HEPES was replaced by 30 mM Tris to prevent ultraviolet-light-mediated
oxidation³² of HEPES (and K-gluconate was reduced from 130 to
120 mM). For Na⁺ and K⁺ imaging experiments, Lucifer yellow was replaced with
1 mM of the Na⁺-sensing dye SBFI tetra-ammonium salt or of the K⁺-sensing dye
PBFI tetra-ammonium salt (Molecular Probes). In some experiments MK-801
(1 mM) was added to the internal solution to block NMDA receptors, and cells
were depolarized to −10 mV for 10 s intermittently over a 20 min waiting period
to facilitate MK-801 block of open channels.

Single-cell ion imaging and H⁺ uncaging. For Fura-2, SBFI and PBFI imaging
when applying NMDA or during ischaemia, white matter oligodendrocytes and
grey matter granule cells were patch-clamped with pipettes containing a solution
as described above, fluorescence was excited sequentially at 340 ± 10 nm and
380 ± 10 nm, and emitted light was collected at 510 ± 20 nm. The ratio (R) of
the emission intensities (340 nm/380 nm), after subtraction of the background
intensity averaged over 4 distant areas of the image, was used as a measure of
intracellular ion concentration. Increases of ion concentration generated a fall of
fluorescence (F) excited at 380 nm and a rise in fluorescence excited at 340 nm,
which is plotted as ΔR/R in the graphs shown, with R = F340 nm/F380 nm: an upward
deflection corresponds to a rise of concentration of the sensed ion. For Fura-2,
SBFI and PBFI, mean values of R before applying NMDA or ischaemia solution
were 0.41 ± 0.05 (n = 5), 1.68 ± 0.14 (n = 13) and 1.86 ± 1.10 (n = 8) respectively.
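
The ratio calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of plain lists of intensities, and the choice of the first few frames as the pre-drug baseline are hypothetical and are not taken from the original analysis.

```python
from statistics import mean


def background_subtracted_ratio(f340, f380, bg340, bg380):
    """Return R = F340/F380 after subtracting the mean background.

    bg340 and bg380 hold the background intensities measured in the
    distant image areas (4 areas in the text) at each wavelength.
    """
    return (f340 - mean(bg340)) / (f380 - mean(bg380))


def delta_r_over_r(ratios, n_baseline):
    """Return the fractional change of R relative to the pre-drug baseline.

    The baseline R is taken as the mean of the first n_baseline samples
    (an assumption made for this sketch).
    """
    r0 = mean(ratios[:n_baseline])
    return [(r - r0) / r0 for r in ratios]


# Hypothetical intensities: a rise of the sensed ion raises F340 and lowers F380.
r_rest = background_subtracted_ratio(150.0, 300.0, [50.0] * 4, [100.0] * 4)
r_peak = background_subtracted_ratio(170.0, 260.0, [50.0] * 4, [100.0] * 4)
trace = delta_r_over_r([r_rest, r_rest, r_peak], n_baseline=2)
```

With this convention an upward deflection of the trace corresponds to a rise in the concentration of the sensed ion, as in the plotted graphs.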

Fluo-4 and Alexa Fluor 594 were used in the internal solution to measure
[Ca²⁺]ᵢ changes ratiometrically during H⁺-uncaging and most ischaemia experiments.
To measure [Mg²⁺]ᵢ, Mag-Fluo-4 was used instead of Fluo-4. Fluo-4
(or Mag-Fluo-4) and Alexa Fluor 594 fluorescence were excited sequentially every
2, 10 or 30 s at 488 ± 10 nm and 585 ± 10 nm, and emission was collected using a
triband filter cube (DAPI/FITC/Texas red, 69002, Chroma). The mean ratio of
intensities (F488 nm/F585 nm) before applying NMDA or ischaemia was 0.81 ± 0.09
(n = 16) for Fluo-4 and 0.55 ± 0.04 (n = 6) for Mag-Fluo-4. Caged H⁺ were
uncaged using 380 ± 20 nm light for 1 s every 2 s (repeated 30 times) interspersed
with the above excitation wavelengths. BCECF was imaged every 30 s at 400 and
480 nm, with emission collected using the above tri-band filter. The ratio (R) of
the emitted light excited by these two wavelengths (F480 nm/F400 nm) was used as a
measure of [H⁺]ᵢ (mean value before ischaemia was 17.6 ± 0.9, n = 8) but, since
this ratio decreases with increasing [H⁺]ᵢ, when plotting changes in ΔR/R in
Fig. 3e and Extended Data Fig. 4 we multiplied them by −1 to produce a trace that
increased with [H⁺]ᵢ.

During ischaemia, slices swelled at the time of the anoxic depolarization. When
cells were patch-clamped with calcium dyes, the resulting movement of the cell away
from the electrode sometimes caused [Ca²⁺]ᵢ oscillations within the cells. These
oscillations did not occur if the patch pipette was removed (after 2 min to allow
dye-filling) before the ischaemic solution was applied. Without the pipette attached
to the cell, the time-course of the ischaemia-evoked [Ca²⁺]ᵢ rise was the same as with
the electrode attached, but its amplitude was 69% larger (ratio increase 0.21 ± 0.03,
n = 20 versus 0.12 ± 0.02, n = 16). In some experiments (those in Fig. 4c-g and
Extended Data Fig. 6) we therefore removed the pipette for calcium-imaging.

Control experiments were carried out to check whether the ischaemia-evoked
change of pH would affect our [Ca²⁺]ᵢ measurements. The internal solution for Ca²⁺-sensing
was studied in the experimental bath that the slices usually are placed in.
The resting ratio of Fluo-4 fluorescence to Alexa 594 fluorescence was not significantly
affected by altering the pH of the solution from 7.05 to 6.55, and this also
did not affect the change of ratio produced by adding 200 nM Ca²⁺ to the sensing
solution (Extended Data Fig. 4b, c). Thus, even a 0.5 unit pH change occurring in
the oligodendrocyte would not significantly affect the calcium dye measurements.
AM dye loading. X-Rhod-1-AM (38 µM) dye loading with the myelin marker
DiOC6 into P12 cerebellar slices was performed as described previously for optic
nerves. Loading times ranged from 1-2 h and a de-esterification period of 30 min
at 36 °C was allowed before imaging.
Potassium electrodes. Potassium electrodes were made as described³³. Electrodes
were pulled with a resistance of 4-10 MΩ. Electrode tips were silanized by heating
them to 250 °C for 7 min while N₂ and N,N-dimethyltrimethylsilylamine (Fluka)
were gassed into the tip from the back of the electrode. The tip was then filled
with either 6% valinomycin, 1.5% potassium tetrakis(4-chlorophenyl)borate
(Fluka) and 92.5% 1,2-dimethyl-3-nitrobenzene (Fluka) or the pre-made potassium-sensitive
ionophore I - cocktail B (Fluka). The electrodes were back-filled
with the bicarbonate-buffered external solution mentioned above (2.5 mM K⁺),
and attached to a sensitive high resistance electrometer (Model FD 223, World
Precision Instruments). A reference electrode tip was placed less than 5 µm away
from the K⁺ electrode tip, and the voltage changes measured by it were subtracted
from those measured with the K⁺ electrode. [K⁺]ₒ was determined by calibrating
each electrode at the end of every experiment with at least 3 different K⁺ concentrations
(1, 2.5, 5, 7.5, 10 or 17.5 mM). To check for cross-reactivity, the [NaCl]ₒ
was decreased by 60 mM which led to a −2.2 ± 0.1 mV change in voltage (n = 3),
while a pH change from 7.3 to 6.5 led to a 0.47 ± 0.22 mV change (n = 3). Both
of these changes are much less than the 17.5 ± 0.5 mV change (n = 18) seen in
response to an increase of [K⁺]ₒ from 2.5 to 5 mM (which is consistent with the
electrodes used having an average calibration slope of 60.9 ± 0.9 mV (n = 6) per
tenfold change of [K⁺]ₒ).
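
As an illustration of the calibration arithmetic, the following Python sketch fits a log-linear calibration line and converts an electrode voltage back to a K⁺ concentration. It assumes the electrode response is linear in log10 of the concentration over the calibrated range; the function names and the synthetic calibration points are hypothetical and not part of the original methods.

```python
import math


def fit_calibration(concs_mM, volts_mV):
    """Least-squares fit of V = intercept + slope * log10([K+])."""
    xs = [math.log10(c) for c in concs_mM]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_v = sum(volts_mV) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in zip(xs, volts_mV))
    slope = sxv / sxx
    return slope, mean_v - slope * mean_x


def k_from_voltage(v_mV, slope, intercept):
    """Invert the calibration line to obtain [K+] in mM."""
    return 10 ** ((v_mV - intercept) / slope)


def expected_step(slope_mV_per_decade, k_from_mM, k_to_mM):
    """Voltage change expected for a step in [K+] at a given slope."""
    return slope_mV_per_decade * math.log10(k_to_mM / k_from_mM)


# Synthetic calibration at three concentrations with an ideal 60.9 mV slope.
concs = [1.0, 2.5, 10.0]
volts = [60.9 * math.log10(c) for c in concs]
slope, intercept = fit_calibration(concs, volts)
step = expected_step(60.9, 2.5, 5.0)
```

For a slope of 60.9 mV per decade, doubling [K⁺]ₒ from 2.5 to 5 mM gives an expected step of about 18.3 mV, close to the measured 17.5 ± 0.5 mV.
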
Drugs used. Stock solutions of the following drugs were made up in water: NMDA,
AP5, NBQX, MK-801, 7-CK, TTX, PDC, IPP, CPG, SKF 96365 and RuR. (S)-MCPG
and amiloride were made up in external solution. Carvacrol was made up in ethanol.
Bicuculline, bumetanide, HC 030031, A967079, flufenamic acid, capsazepine,
FTY720-HCl, 2-APB, AITC, RN1734 and ML204 were made up in DMSO. When
used, DMSO and ethanol were also added to control solution at the same concentrations,
and did not evoke [Ca²⁺]ᵢ changes at the concentrations used. Stocks were
kept at −20 °C apart from carvacrol, menthol, vanillin, AITC, and RuR, which were
made up fresh on each day of use. To minimise evaporation of carvacrol, vanillin
and menthol, lids were kept on until the solutions were used. Gd³⁺ and La³⁺ were
applied (as chloride salts) in bicarbonate- and phosphate-free solution to avoid
chelation by these anions (see External solutions section earlier).
Immunohistochemical labelling of oligodendrocytes. Cerebellar slices were 
fixed for 30 min in 4% paraformaldehyde (PFA), and incubated for 1 to 6h in 
0.1% Triton X-100, 10% goat serum in phosphate-buffered saline at 21°C, then 
with primary antibody at 4°C overnight with agitation, and then 2h or overnight 
at 24°C with secondary antibody. Primary antibodies were: anti-CC1 (mouse, 
1:300, Calbiochem OP80 monoclonal) and anti-Olig-2 (rabbit, 1:700, Millipore, 
AB9610 polyclonal). Secondary antibodies were: goat anti-rabbit Alexa Fluor 488 
or 568 (Molecular Probes, 1:1,000), donkey anti-rabbit Alexa Fluor 488 (Millipore, 
1:1,000), and goat anti-mouse Alexa Fluor 568 (Millipore, 1:1,000). 

Antibody labelling and in situ hybridization for TRPA1 and TRPV3. TRPA1 
and TRPV3 antibodies appeared to label the myelinating processes and somata of 
oligodendrocytes in rat, but the labelling was not significantly different in wild-type 
mice and mice with TRPA1 or TRPV3 knocked out (data not shown). We therefore 
turned to in situ hybridization. 

Solutions used for in situ hybridization were pretreated with 0.1% DEPC.
Animals were perfused with PBS followed by 4% PFA. Brains were post-fixed in 4%
PFA overnight at 4 °C, cryoprotected in 20% sucrose overnight at 4 °C and frozen
in Tissue-Tek OCT. Sections (20 µm) collected onto Superfrost Plus microscope
slides (VWR International) were hybridized at 65 °C overnight with hybridization
buffer [50% v/v deionized formamide (Sigma), 10% w/v dextran sulphate (Fluka),
0.1 mg ml⁻¹ yeast tRNA (Roche), 1× Denhardt's solution (Sigma) and 1× 'salts'
(200 mM NaCl, 5 mM EDTA, 10 mM Tris-HCl pH 7.5, 5 mM NaH₂PO₄, 5 mM
Na₂HPO₄)] containing digoxigenin (DIG)-labelled antisense RNA probe (1:1,000).
Sections were washed with a washing solution (50% w/v formamide, 1× SSC, 0.1%
Tween 20) three times at 65 °C for 30 min, followed by two 1× MABT (100 mM
maleic acid, 150 mM NaCl, pH 7.5, 0.1% Tween-20) washes at room temperature
for 30 min each. Sections were subsequently blocked with blocking solution (2%
w/v blocking reagent (Roche Diagnostics), 10% v/v heat-inactivated sheep serum
(Sigma) in 1× MABT) for 1 h at room temperature and incubated with anti-DIG
antibody conjugated with alkaline phosphatase (AP) (Roche Diagnostics, 1:1,500
in blocking solution) at 4 °C overnight. Sections were then washed in 1× MABT
5 times for 20 min each at room temperature, followed by two 5 min washes in
staining buffer (100 mM NaCl, 50 mM MgCl₂, 100 mM Tris-HCl, pH 9.5, 0.1%
Tween-20). Development was performed at 37 °C for 24-48 h overnight with
nitroblue tetrazolium/5-bromo-4-chloro-3-indolyl phosphate in freshly prepared
staining solution (50% v/v staining buffer, 25 mM MgCl₂, 5% w/v polyvinyl alcohol).
Sections were washed in PBS and immunohistochemistry was performed as
described above. The plasmids used to generate RNA probes were: IMAGE clone
40129486 for Trpa1 (linearized with ClaI and transcribed with T3 RNA polymerase)
and IMAGE clone 40047664 for Trpv3 (linearized with XhoI and transcribed
with SP6 RNA polymerase). In situ hybridization was repeated using at least three
animals for each probe.

Quantifying myelin decompaction during chemical ischaemia using electron
microscopy. For chemical ischaemia experiments, optic nerves were dissected
from P28 Sprague-Dawley rats and incubated for 1 h at 36 °C in either control
or ischaemic solution with and without the TRPA1/V3 channel blocker ruthenium
red (10 µM) or the combined presence of the TRPA1 blockers HC-030031
(80 µM) and A967079 (10 µM). The optic nerves were then immersion fixed in 2%
paraformaldehyde and 2% glutaraldehyde in 0.1 M cacodylate buffer overnight.
All samples were then post-fixed in 1% OsO₄/0.1 M cacodylate buffer (pH 7.3) at
3 °C for 2 h before washing in 0.1 M cacodylate buffer (pH 7.3). The samples were
dehydrated in a graded ethanol-water series at 3 °C and infiltrated with Agar 100
resin mix. The nerve was then cut transversely at the mid-point, blocked out and
hardened. Ultra-thin sections were taken, 300 µm from the cut end of the middle
of the nerve, on a Reichert Ultracut S microtome. Sections were collected on 300
mesh copper grids and stained with lead citrate. The sections were imaged using a
JEOL 1010 transmission electron microscope and a Gatan Orius camera.

In 3 out of 4 experiments the experimenter was blinded to the drug condition
before imaging (all 4 experiments gave similar results). One section was used from
each nerve and eight 21.5 µm × 17.3 µm images were collected at ×8,000 magnification,
four from the peripheral borders of the nerve at 0°, 90°, 180° and 270°
positions on the section, and four covering the central portion of the nerve. In all
experiments the image identities were then blinded before analysis, and the number
of large separations of lamellae (decompaction) was counted. Decompaction was
defined as a visible white inter-lamellar gap being present between at least 2 normal
lamellae. Regions of decompaction were normally separated from each other by
an area of compact myelin, but when most of the myelin surrounding an axon had
separated lamellae, decompacted regions were counted at 0.5 µm intervals around
the sheath. The number of decompacted regions was normalized to the number of
axons per image. Some decompaction occurred even in control nerves as a result of
the processing for electron microscopy, so we assessed drug block of decompaction
by quantifying the ischaemia-evoked increase in decompactions seen without and
with the drug present. Myelin g ratios were calculated as the square root of the ratio
of the area of the axon to the area of the axon plus myelin sheath. When drawing
lines around the axon and sheath, areas of decompaction were ignored, that is, we
interpolated the lines from regions that were not decompacted. Axon diameter was
calculated as (4(axon area)/π)^0.5. Axon vacuolization was defined as the inclusion
of one or more large (>0.1 µm) empty membrane bound (often circular) organelles
within the axon or periaxonal space (Fig. 4m) which may reflect rearrangement
of internal axonal membranes or be formed from inclusion of myelin membranes
into the axon (Fig. 4m).
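
The morphometric definitions above translate directly into code. The Python sketch below is a minimal illustration, assuming that axon and fibre (axon plus sheath) areas have already been traced and are expressed in µm²; the function names and example numbers are hypothetical.

```python
import math


def g_ratio(axon_area, fibre_area):
    """Square root of (axon area) / (area of axon plus myelin sheath)."""
    return math.sqrt(axon_area / fibre_area)


def axon_diameter(axon_area):
    """Diameter of the circle with the same area: (4 * area / pi) ** 0.5."""
    return math.sqrt(4.0 * axon_area / math.pi)


def decompactions_per_axon(n_decompacted_regions, n_axons):
    """Decompacted regions in an image, normalized to its axon count."""
    return n_decompacted_regions / n_axons


def ischaemia_evoked_increase(ischaemia_mean, control_mean):
    """Increase over the control level, which reflects processing artefacts."""
    return ischaemia_mean - control_mean


# Hypothetical circular axon of radius 0.5 um inside a fibre of radius 1 um.
g = g_ratio(math.pi * 0.5 ** 2, math.pi * 1.0 ** 2)
d = axon_diameter(math.pi * 0.5 ** 2)
```

For circular profiles the area-based g ratio reduces to the familiar ratio of axon diameter to fibre diameter, here 0.5.
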
Statistics. Data are presented as mean ± s.e.m. Experiments were carried out
on brain slices from at least 3 animals on at least 3 separate days, except for a
few experiments using expensive drugs which were done on only 2 days. Only
1 cell was recorded from in each slice, so the numbers of cells given are also the
numbers of slices. P values are from two-tailed Student's t-tests (for normally distributed
data, assessed using Shapiro-Wilk tests) or Mann-Whitney U tests (for
non-normally distributed data). Normally distributed data were tested for equal
variance (P < 0.05, unpaired F-test) and homo- or heteroscedastic t-tests were
chosen accordingly. P values in the text are from unpaired t-tests unless otherwise
stated. When small sample sizes (n < 4) achieved P < 0.05, analysis of sample and
effect size typically demonstrated a power for detecting the observed effect of
80-99% (mean 92%), with two exceptions: the process data in Fig. 4c (power 78%)
and the soma data in Fig. 4h (power 75%). For multiple comparisons within one
experiment (usually one figure panel, but measurements of [Ca²⁺]ᵢ in somata and
processes were treated as separate experiments even when plotted in the same
figure panel), P values were corrected using a procedure equivalent to the Holm-Bonferroni
method (for N comparisons, the most significant P value is multiplied
by N, the 2nd most significant by N − 1, the 3rd most significant by N − 2, etc.;
corrected P values are significant if they are less than 0.05). All statistical analysis
was conducted using OriginLab software.
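
The correction rule described above can be written as a short Python function. This is a sketch, not the OriginLab procedure used for the analysis: the function name is hypothetical, and the running maximum and the cap at 1 are the usual Holm step-down conventions, added here as assumptions so that a less significant P value can never end up with a smaller corrected value than a more significant one.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return (corrected P values, significance flags) in the input order.

    The smallest P value is multiplied by N, the next by N - 1, and so on.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    corrected = [0.0] * n
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, p_values[idx] * (n - rank))
        running_max = max(running_max, value)  # enforce monotonicity
        corrected[idx] = running_max
    return corrected, [p < alpha for p in corrected]


# Three hypothetical comparisons from one figure panel.
adjusted, significant = holm_bonferroni([0.01, 0.04, 0.03])
```

In this hypothetical example only the first comparison remains significant: 0.01 × 3 = 0.03 is below 0.05, whereas 0.03 × 2 = 0.06 is not, and the last comparison inherits that value through the running maximum.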


31. Allen, N. J., Káradóttir, R. & Attwell, D. A preferential role for glycolysis in
preventing the anoxic depolarization of rat hippocampal area CA1 pyramidal
cells. J. Neurosci. 25, 848-859 (2005).

32. Keynes, R. G., Griffiths, C. & Garthwaite, J. Superoxide-dependent consumption
of nitric oxide in biological media may confound in vitro experiments.
Biochem. J. 369, 399-406 (2003).

33. Marcaggi, P., Jeanne, M. & Coles, J. A. Neuron-glial trafficking of NH₄⁺ and K⁺:
separate routes of uptake into glial cells of bee retina. Eur. J. Neurosci.
19, 966-976 (2004).

Extended Data Figure 1 | Tests for causes of the ischaemia-evoked
current. a, Effect of blocking various putative glutamate release
mechanisms (blocker concentrations given in Supplementary Information)
on peak ischaemia-evoked currents measured in the presence of each
drug and in interleaved controls (data from rat). No significant differences
were measured (P > 0.20). b, Schematic showing effect of ischaemia-evoked
decrease in resting conductance (which is dominated by gK, left)
and ischaemia-evoked [K⁺]ₒ rise (right) on oligodendrocyte membrane
current. Black lines are control I-V relations. Red lines are I-V relations
in ischaemia showing the effect of a conductance decrease (left) or of a
positive shift of reversal potential due to [K⁺]ₒ rising (right). c, Ischaemia-evoked
current change for the two mechanisms in b (cf. Fig. 1g). d, Sum of
currents in c gives an I-V relation with a reversal potential more negative
than EK. e, Experimentally observed ischaemia-evoked current in 10
oligodendrocytes with 10 mM internal HEPES (difference of curves in
Fig. 1f). Error bars are s.e.m.

Extended Data Figure 2 | K⁺ flux changes generate the oligodendrocyte
NMDA-evoked current. a, b, Extracellular Cs⁺ (30 mM, replacing Na⁺)
reduces the inward current evoked by 100 µM NMDA at −74 mV in rat
oligodendrocytes (a), but not in hippocampal CA1 pyramidal neurons
(b). c, d, Intracellular MK-801 (1 mM) has no effect on NMDA-evoked
currents in oligodendrocytes (c) but blocks them in pyramidal cells (d),
while extracellular MK-801 (50 µM) blocks both. e, Voltage-dependence
of the current evoked in 16 oligodendrocytes by 100 µM NMDA and by
elevating [K⁺]ₒ from 2.5 to 5 mM. f, Specimen plot of membrane current
in an oligodendrocyte versus local [K⁺]ₒ in response to applying 100 µM
NMDA or elevating [K⁺]ₒ from 2.5 to 5 mM. Horizontal cell line shows
that if NMDA raises [K⁺]ₒ to (say) 4.5 mM, the current attributable to the
[K⁺]ₒ rise alone is 51% of the NMDA-evoked current. Mean value in 11
cells was 49% (see Supplementary Information). Error bars are s.e.m.

Extended Data Figure 3 | NMDA-evoked ion concentration changes in
neurons differ from those in oligodendrocytes. a, Specimen records of
rat cerebellar granule cell membrane current and background-subtracted
fluorescent dye ratio (R, see Methods) when measuring [Ca²⁺]ᵢ with
Fura-2, [Na⁺]ᵢ with SBFI, and [K⁺]ᵢ with PBFI, when 100 µM NMDA was
applied. The rise of [K⁺]ᵢ seen reflects K⁺ entry: [K⁺]pipette was 32.5 mM,
so EK > −60 mV for [K⁺]ₒ > 3.3 mM. b, Mean peak fluorescence change
normalized to evoked current (number of cells on bars; P value from
Mann-Whitney test). Oligodendrocyte data for comparison are shown in
Fig. 2. Error bars are s.e.m.

Extended Data Figure 4 | Comparison of NMDA-, [K⁺]bath- and
ischaemia-evoked changes of [H⁺]ᵢ. a, Measurements of changes of
ratio (R, see Methods) of background-subtracted BCECF fluorescence in
rat oligodendrocytes in response to 100 µM NMDA (5 cells) and raising
[K⁺]bath from 2.5 to 5 mM with 0.5 mM internal HEPES (6 cells), and
to ischaemia with 0.5 mM and 50 mM internal HEPES (9 and 6 cells,
respectively). b, c, Effect of pH of 10 mM HEPES internal solution on
baseline ratio of Fluo-4 to Alexa 594 fluorescence (P value from Mann-Whitney
test) (b), and change of ratio when [Ca²⁺] was increased by 200 nM
(c). Error bars are s.e.m.

Extended Data Figure 5 | In situ hybridization data on TRP channel
expression. a, In situ data for TRPA1 and TRPV3 in the cerebellum of
rats and mice show TRPA1 messenger RNA in white matter (WM) cells
in rats and mice (with denser expression in the adjacent granule cell layer,
GCL), but TRPV3 mRNA only in white matter cells in rats. Specimen
cells are labelled with white circles. b, Higher magnification views of
white matter, combining in situ hybridization for TRPA1 and TRPV3
with immunocytochemistry for Olig2 (to label oligodendrocyte lineage
cells) and CC1 (to define myelinating oligodendrocytes). TRPA1 mRNA
is present (in rats and mice) and TRPV3 mRNA is present (in rats but not
mice) in myelinating oligodendrocytes (Olig2⁺, CC1⁺: arrowheads) and
also in some presumed oligodendrocyte precursor cells (Olig2⁺, CC1⁻:
arrows). c, d, Quantification of presence of mRNA for TRPA1 (c) and
TRPV3 (d) in different oligodendrocyte lineage cell classes. Numbers on
bars are 'images analysed (cells counted)'. P value from Mann-Whitney
test. Error bars are s.e.m.

Extended Data Figure 6 | Further evidence for the identity of TRP
channels in oligodendrocytes. [Ca²⁺]ᵢ increase (ratio signal from Fluo-4
and Alexa Fluor 594) in rat oligodendrocyte somata when ischaemia
solution was applied with the following drugs present (data normalized
to interleaved controls, shown as black bars): RN-1734 (0.5 mM), which
blocks TRPV4 and, less well, TRPV1, TRPV3 and TRPM8; a cocktail of
blockers inhibiting (see Supplementary Information section on Specificity
of drugs acting on TRP channels) TRPP2, TRPC3, TRPC4, TRPC5,
TRPC6, TRPC7, TRPM2, TRPM4, TRPM5, TRPV1, TRPV2, TRPM7,
TRPM8 and TRPP1, as well as the store-operated calcium channel
component STIM1 and some voltage-gated calcium channels; blocking
voltage-gated Ca²⁺ channels with 10 µM benidipine; or blocking reversed
Na/Ca exchange with 10 µM KB-R7943 mesylate (P values, from Mann-Whitney
test and t-test as appropriate, were non-significant (P > 0.28)).
Error bars are s.e.m.

Mitofusin 2 maintains haematopoietic stem cells
with extensive lymphoid potential

Larry L. Luchsinger, Mariana Justino de Almeida, David J. Corrigan, Melanie Mumau & Hans-Willem Snoeck


Haematopoietic stem cells (HSCs), which sustain production of
all blood cell lineages, rely on glycolysis for ATP production,
yet little attention has been paid to the role of mitochondria. Here
we show in mice that the short isoform of a critical regulator
of HSCs, Prdm16 (refs 4, 5), induces mitofusin 2 (Mfn2), a
protein involved in mitochondrial fusion and in tethering of
mitochondria to the endoplasmic reticulum. Overexpression
and deletion studies, including single-cell transplantation assays,
revealed that Mfn2 is specifically required for the maintenance
of HSCs with extensive lymphoid potential, but not, or less so,
for the maintenance of myeloid-dominant HSCs. Mfn2 increased
buffering of intracellular Ca²⁺, an effect mediated through its
endoplasmic reticulum-mitochondria tethering activity, thereby
negatively regulating nuclear translocation and transcriptional
activity of nuclear factor of activated T cells (Nfat). Nfat inhibition
rescued the effects of Mfn2 deletion in HSCs, demonstrating that
negative regulation of Nfat is the prime downstream mechanism of
Mfn2 in the maintenance of HSCs with extensive lymphoid
potential. Mitochondria therefore have an important role in
HSCs. These findings provide a mechanism underlying clonal
heterogeneity among HSCs and may lead to the design of
approaches to bias HSC differentiation into desired lineages after
transplantation.

Within the haematopoietic system, the transcriptional co-regulator,
Prdm16, is expressed selectively in HSCs and its deletion
severely impairs HSC maintenance. However, its molecular
targets remain unknown. We observed that in Prdm16−/− HSCs
(Lin⁻Sca1⁺Kit⁺CD48⁻CD150⁺Flt3⁻, Extended Data Fig. 1) and
mouse embryonic fibroblasts (MEFs) mitochondria were fragmented
(Fig. 1a and Extended Data Fig. 2a). Mitochondria undergo dynamic
fusion and fission. Fusion is driven by the outer membrane
GTPases, mitofusin (Mfn) 1 and 2, and by the inner membrane protein
optic atrophy 1 (Opa1), while fission requires dynamin-related protein
(Drp1). Culture of Prdm16−/− MEFs in the presence of the Drp1
inhibitor, mDivi1 (ref. 14), restored mitochondrial length (Extended
Data Fig. 2b, c), suggesting a fusion defect, which was further documented
in a mitochondrial fusion assay (Extended Data Fig. 2d).
Expression of Mfn2 protein and of Mfn2, but not Mfn1, messenger
RNA was lower in Prdm16-deficient MEFs and HSCs compared to
wild type (Fig. 1b-d and Extended Data Fig. 2e). Furthermore, mitochondria
were similarly fragmented in Prdm16−/− and in Mfn2−/−
MEFs (Extended Data Fig. 2f) and mitochondrial length was restored
after lentiviral expression of Mfn2 in Prdm16−/− MEFs (Fig. 1e, f and
Extended Data Fig. 2g), suggesting that Mfn2 is a target of Prdm16.

Prdm16 exists in two isoforms arising from distinct transcription
start sites, full-length (fl) and short (s) Prdm16, which lacks
the amino-terminal PR-domain (Extended Data Fig. 3a). Only
sPrdm16, but not flPrdm16, activated a Mfn2 promoter luciferase
reporter (Extended Data Fig. 3b, c), and induced Mfn2 mRNA in
Prdm16−/− MEFs (Fig. 1g). Consistent with these findings, chromatin
immunoprecipitation in MEFs using Flag-tagged isoforms of Prdm16
showed binding of sPrdm16, but not of flPrdm16, to the Mfn2 promoter
(Extended Data Fig. 3d). Mfn2 is therefore a direct target of sPrdm16.
Although deletion of Prdm16 did not affect Mfn1 (Fig. 1b, c), transduction
of sPrdm16 did increase Mfn1 mRNA expression (Fig. 1g).
Mfn1 is therefore susceptible to regulation by sPrdm16, but with a
higher and probably unphysiological threshold. Lentiviral transduction
of Mfn2 did not rescue the competitive repopulation defect of
Prdm16+/− HSCs, however (Extended Data Fig. 3e), indicating that
multiple components of the sPrdm16 and flPrdm16 transcriptional
program are required for HSC maintenance.

Induction of Mfn2 by Prdm16, a critical regulator of HSCs, suggests
a role for Mfn2 in HSC function. We therefore assessed mitochondrial
length and Mfn2 expression in haematopoietic cells. HSCs
display clonal heterogeneity in their differentiation potential ranging
from rare lymphoid-biased HSCs, to balanced myeloid/lymphoid and
myeloid-dominant HSCs with low lymphoid potential. Although
the underlying mechanism is unknown and neither functional nor
phenotypic classifications are absolute, myeloid-dominant HSCs are
enriched in the CD150hi fraction, while HSCs with extensive lymphoid potential
are enriched in the CD150lo fraction. HSCs expressed more
Mfn2 mRNA (Fig. 2a) and protein (Fig. 2b) than more mature populations.
Within the HSC compartment, CD150lo HSCs expressed more
Mfn2 mRNA (Fig. 2a) and protein (Fig. 2c) than did CD150hi HSCs. In
contrast, Mfn1 did not show HSC-selective expression, and its expression
in CD150lo HSCs was tenfold lower than that of Mfn2 (Fig. 2a).
In accordance with Mfn2 induction by sPrdm16, sPrdm16 was the
predominant Prdm16 isoform in CD150lo but not in CD150hi HSCs
(Fig. 2d). Using mice expressing a mitochondrially targeted Dendra2
fluorescent protein (Pham mice), we observed longer mitochondria
in HSCs compared to other haematopoietic populations, and within
the HSC compartment, in CD150lo than in CD150hi cells (Fig. 2e and
Extended Data Fig. 4a, b). Mitochondrial length therefore paralleled
Mfn2 expression.

As these findings suggested a subpopulation-specific role for
Mfn2 in HSCs, we examined mice with conditional deletion of
Mfn2 (ref. 21) in the haematopoietic system (Mfn2fl/fl-Vav-Cre).
The frequency of progenitors in the bone marrow and thymus and
of mature populations in blood and spleen were similar in Mfn2fl/fl-Vav-Cre
mice and Mfn2fl/fl littermates (Extended Data Table 1). The
Lin⁻Sca1⁺Kit⁺CD48⁻CD150⁺ HSC compartment in Mfn2fl/fl-Vav-Cre
mice showed mitochondrial fragmentation (Extended Data Fig. 5a, b),
was smaller (Extended Data Table 1) and expressed more CD150
(Extended Data Fig. 5c) compared to that of Mfn2fl/fl mice, indicating
a loss primarily of CD150lo HSCs. Competitive repopulation
studies showed a further increase in CD150 expression within the
donor HSC compartment (Extended Data Fig. 5d, e) and a defect
in long-term lymphoid repopulation in recipients of Mfn2fl/fl-Vav-Cre
adult bone marrow (Fig. 3a) and fetal liver cells (Extended Data
Fig. 5f). A decrease in myeloid repopulation was noted, but did not

Figure 1 | Prdm16 induces Mfn2. a, Mitochondrial morphology
of MTR-stained wild-type (WT) and Prdm16−/− fetal liver HSCs
(Lin⁻Sca1⁺Kit⁺CD48⁻CD150⁺Flt3⁻) (scale bars, 5 µm) and MEFs (scale
bars, 20 µm; inset, 5 µm). b, Relative mRNA expression of Prdm16, Mfn1
and Mfn2 in WT and Prdm16−/− MEFs. n = 3 independent experiments;
*P < 0.05; two-tailed Student's t-test. c, Relative mRNA expression
of Prdm16, Mfn1 and Mfn2 in WT and Prdm16+/− adult HSCs. n = 3
biological replicates; *P < 0.05; two-tailed Student's t-test. d, Representative
immunofluorescence for Mfn2 and Tomm20 in WT and Prdm16−/− HSCs
(scale bar, 5 µm) and Mfn2 quantification normalized to Tomm20. Bars,
mean ± s.e.m.; n > 10 fields of cells from two biological replicates; *P < 0.05;
two-tailed Student's t-test. e, Mitochondrial morphology in Prdm16−/−
MEFs co-transduced with Mito-dsRed and Mfn2 or control lentivirus (EV,
empty vector) (scale bar, 20 µm). f, Mitochondrial length profiles in WT
and Prdm16−/− MEFs transduced with EV or Mfn2 (>12 cells and >80
mitochondria from 2 biological replicates). Note that Mfn2 overexpression
in WT MEFs causes aggregation of mitochondria and apparent shortening,
as reported previously. g, Relative mRNA expression in Prdm16−/− MEFs
of Prdm16, Mfn1 and Mfn2 72 h after retroviral expression of sPrdm16 and
flPrdm16. n = 3 biological replicates; *P < 0.05; one-way analysis of variance
(ANOVA) with Dunnett's post-hoc test.


reach statistical significance (Fig. 3a and Extended Data Fig. 5f). 
Lentiviral overexpression of Mfn2 in wild-type HSCs yielded reciprocal
results (Extended Data Fig. 6a-i). As these phenotypic analyses
and transplantation experiments suggested a selective requirement for
Mfn2 in the maintenance of HSCs with extensive lymphoid potential,
we performed competitive single HSC transplantation studies to rigorously
determine clonal variation in differentiation potential. Although
out of >100 recipients too few mice were reconstituted to statistically
assess HSC frequency, among recipients with >0.1% donor contribution
most Mfn2fl/fl-Vav-Cre HSCs were myeloid-dominant, whereas
most Mfn2fl/fl HSCs were balanced or lymphoid 8 weeks after transplantation.
In mice that still showed repopulation after 13 weeks, only
myeloid-dominant HSCs were detected in recipients of Mfn2fl/fl-Vav-Cre cells,
while most donor HSCs had extensive lymphoid potential in recipients
of Mfn2fl/fl cells (Fig. 3b). To more accurately determine HSC frequencies
we performed limiting dilution experiments. Among Mfn2fl/fl-Vav-Cre
HSCs, overall repopulating HSC frequency was decreased fourfold
compared to Mfn2fl/fl HSCs (Fig. 3c and Extended Data Table 2).
Figure 2 | Mitochondrial morphology and Mfn2 in HSCs. a, Expression
of Mfn2 and Mfn1 mRNA in HSCs, progenitors (MPP, multipotential
progenitors, Lin−Sca1+Kit+CD150−Flt3+; CMP, common myeloid
progenitors, Lin−Sca1−Kit+; CLP, common lymphoid progenitor,
Lin−Sca1+KitloFlt3+IL7Rα+ and Lin, lineage+ cells). n = 3 biological
replicates; *P < 0.05; one-way ANOVA with Dunnett's post-hoc test.
b, Mfn2 immunofluorescence in Pham-reporter+ HSCs and CMPs (Pham
mice express a mitochondrially targeted Dendra2 fluorescent protein)
(scale bar, 5 μm). c, Immunofluorescence staining for Mfn2 and CD150
in CD150hi and CD150lo LSKCD48− HSCs (scale bar, 5 μm). Cell nuclei
counterstained with DAPI. d, Quantification of flPrdm16 and sPrdm16
mRNA in HSCs, progenitors and Lin+ cells. n = 3 biological replicates;
*P < 0.05; one-way ANOVA with Dunnett's post-hoc test. e, Mitochondrial
length frequency profile in Pham-reporter+ HSCs, progenitors and Lin+
cells (n > 15 fields from three biological replicates).

The frequency of HSCs capable of >1% long-term lymphoid
reconstitution was also approximately fourfold lower. However, the
decrease in the frequency of HSCs capable of >1% myeloid reconsti- 
tution did not reach statistical significance (Fig. 3c and Extended Data 
Table 2). Taken together, these results indicate that Mfn2 is required for 
the maintenance of HSCs with extensive lymphoid potential. 

Next, we identified the mechanism of action of Mfn2. Apoptosis
and reactive oxygen species production were similar in Mfn2fl/fl-Vav-Cre
and Mfn2fl/fl HSCs, but Mfn2fl/fl-Vav-Cre HSCs and progenitors
displayed increased expression of Grp78, a marker of endoplasmic
reticulum stress, which has been shown to be associated with Mfn2
deletion (Extended Data Fig. 7a-c). Mfn2, but not Mfn1, also tethers
mitochondria to the endoplasmic reticulum, thereby enhancing intracellular
calcium buffering (ref. 6). Indeed, intracellular Ca²⁺ was increased
in Mfn2fl/fl-Vav-Cre compared to Mfn2fl/fl HSCs (Fig. 4a), in CD150hi
compared to CD150lo HSCs (Fig. 4b) and in Prdm16−/− compared to
wild-type LSK cells (Fig. 4c), but was decreased after lentiviral transduction
of wild-type HSCs (Extended Data Fig. 7d). However, Mfn2
did not affect ATP- or SDF1-induced intracellular Ca²⁺ transients
(Fig. 4a-c and Extended Data Fig. 7d). At variance with these data
and with a previous report showing only delayed removal of intracellular
Ca²⁺ (ref. 6), we found both lower baseline intracellular Ca²⁺
and lower amplitude of ATP-induced calcium flux in Mfn2−/− MEFs
(Extended Data Fig. 7e). Thus, despite cell type-specific differences
in calcium homeostasis, Mfn2 negatively regulates intracellular Ca²⁺.

Figure 3 | Role of Mfn2 in HSC function. a, Donor […]
transplantation of 2 x 10° Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) adult bone marrow cells together with
2 x 10° CD45.1+ competitor bone marrow cells into
CD45.1+CD45.2+ recipients. Plots, mean ± s.e.m.;
n = 7-8 recipients from two independent transplants;
*P < 0.05; two-tailed Student's t-test. b, Donor GM/(B+T)
reconstitution ratios 8 (left) and 13 (right) weeks after
transplantation of single HSCs from Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) mice. Plots, mean ± s.e.m.; n > 8
recipients; *P < 0.05; Student's t-test. Colours represent
myeloid-biased (blue), balanced (pink) and lymphoid
dominant (green) as defined in ref. 11. c, Limiting dilution
assay (see Methods) with Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) adult bone marrow HSCs co-transplanted
with CD45.1+ competitor bone marrow cells analysed
for total, myeloid or lymphoid potential 15 weeks after
transplantation. Plots, log% negative responders; n = 2
independent experiments with 4-5 recipients each per cell
dose; frequencies calculated using limiting dilution analysis
Poisson distribution; P < 0.05; Pearson's chi-squared test.

Sustained increase in intracellular Ca²⁺ activates calcineurin, which
dephosphorylates four isoforms of Nfat and promotes their translocation
to the nucleus, where they orchestrate multiple processes.
We therefore examined whether Mfn2 inhibits Nfat. As measured by
immunofluorescence, Nfat1 nuclear localization was decreased after
lentiviral transduction of wild-type HSCs (Extended Data Fig. 7f),
and increased in Mfn2fl/fl-Vav-Cre compared to Mfn2fl/fl HSCs
(Fig. 4d) and in Mfn2−/− compared to wild-type MEFs (Extended
Data Fig. 7g), which was confirmed by cellular fractionation followed
by western blot (Fig. 4e). Consistent with inhibition of Nfat nuclear
localization by Mfn2, the fraction of nuclear Nfat was also higher
in CD150hi compared to CD150lo HSCs (Fig. 4f) and in Prdm16−/−
compared to wild-type HSCs (Fig. 4g).

Figure 4 | Mechanism of action of Mfn2.
a-c, Calcium flux trace (left) and baseline
intracellular Ca²⁺ (right) in Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) HSCs (a), CD150hi and CD150lo
HSCs (b), and WT and Prdm16−/− HSCs (c). Plots,
mean ± s.e.m.; n = 3 biological replicates; *P < 0.05;
two-tailed Student's t-test. d, Nfat1 staining (left;
scale bar, 5 μm) and percentage of nuclear Nfat1
(right, *P < 0.05) in Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) HSCs. Plot, mean ± s.e.m.; n > 7 fields of
cells from two biological replicates; *P < 0.05;
two-tailed Student's t-test. e, Subcellular
fractionation followed by western blot for Nfat1 in
WT and Mfn2−/− MEFs (left, see Supplementary
Fig. 1 for full scan) and relative quantification of
nuclear and cytoplasmic fraction (right). Bars,
[…] transplants; *P < 0.05; one-way ANOVA with
Dunnett's post-hoc test. j, Schematic representation
of the mechanism of action of Mfn2.

We next assessed the effect of
Mfn2 on Nfat transcriptional activity. Transfection of Mfn2−/− MEFs
with a Nfat-responsive luciferase reporter revealed increased Nfat 
transcriptional activity compared to wild-type MEFs (Extended Data 
Fig. 7h). Conversely, overexpression of Mfn2 in NIH-3T3 fibroblasts 
reduced Nfat-responsive luciferase reporter activity to the same extent 
as VIVIT, a membrane-permeable peptide that specifically inhibits the 
interaction between calcineurin and Nfat (Fig. 4h). However, Mfn1
and a dominant negative DRP1 mutant (Extended Data Fig. 7i) had no 
effect (Fig. 4h). Unlike Mfn2, Mfn1 and Drp1 only regulate mitochon- 
drial fusion and fission. These findings therefore suggest that Mfn2 
inhibits Nfat through its endoplasmic reticulum-mitochondria teth- 
ering activity rather than through mitochondrial fusion. Finally, and 
consistent with Mfn2 induction by sPrdm16, lentiviral transduction
of Prdm16−/− HSCs with sPrdm16, but not with flPrdm16, reduced
Nfat nuclear localization (Extended Data Fig. 7j), while Nfat transcriptional
activation was increased in Prdm16−/− MEFs and normalized by
transfection of either sPrdm16 or Mfn2, but not flPrdm16 (Extended 
Data Fig. 7k). We conclude that sPrdm16-mediated induction of Mfn2 
inhibits Nfat activity. 

To directly examine the role of Nfat downstream of Mfn2, we inhibited
its function in HSCs. Culture of wild-type HSCs in the presence
of VIVIT peptide increased expression of the CD150lo HSC-associated
lymphoid commitment markers Il7r and Sox4 (ref. 18; Extended
Data Fig. 8a), and increased the lymphoid/myeloid repopulation ratio after
subsequent competitive transplantation (Extended Data Fig. 8b). Most
importantly, lentiviral VIVIT-GFP transduction fully rescued the
long-term lymphoid reconstitution defect of Mfn2fl/fl-Vav-Cre HSCs
(Fig. 4i) and, similar to Mfn2 transduction, increased lymphoid repopulation
of wild-type HSCs (Extended Data Fig. 8c). Nfat inhibition is
therefore the prime mechanism downstream of Mfn2.

The observation that Mfn2, induced by sPrdm16, maintains HSCs
with extensive lymphoid potential by negatively regulating calcineurin/
Nfat activity through enhanced intracellular Ca²⁺ buffering
(Fig. 4j) identifies such HSCs, which decline with age, as a mechanistically
defined subset and provides a mechanism underpinning
clonal heterogeneity in HSCs. These findings may lead to the design 
of approaches to bias HSC differentiation into desired lineages after 
transplantation. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper.

METHODS

Animals. C57BL/6J mice (CD45.2) and B6.SJL-Ptprca Pepcb/BoyJ (CD45.1) mice were
purchased from The Jackson Laboratory (Bar Harbor, ME). Prdm16Gt(OST67423)Lex
knockout mice were obtained from Lexicon Genetics. Conditional Mito-Dendra2
transgenic (Pham) mice (B6;129S-Gt(ROSA)26Sortm1(CAG-COX8A/Dendra2)Dcc/J)
and E2A-Cre mice (B6.FVB-Tg(EIIa-cre)C5379Lmgd/J) were purchased
from The Jackson Laboratory. Pham mice contain a mitochondrially targeted
Dendra2 preceded by a stop-lox sequence in the Rosa locus. These mice were crossed
with E2A-Cre mice to effect ubiquitous induction of the MitoDendra2 reporter.
Conditional Mfn2 knockout mice (B6/129SF1-Mfn2tm3Dcc/Mmucd) were obtained
from MMRRC and crossed to Vav-Cre transgenic mice (B6.Cg-Tg(Vav1-Cre)A2Kio/J)
to obtain a homozygous floxed Mfn2 allele, which generated a
B6.Cg-Tg(Vav1-Cre)A2Kio/J;B6/129SF1-Mfn2tm3Dcc/Mmucd mixed mouse strain. All
mouse strains were rederived by in vitro fertilization at The Jackson Laboratory.
Animals were housed in a specific pathogen-free facility. Experiments and animal
care were performed in accordance with the Columbia University Institutional
Animal Care and Use Committee. All mice were used at age 8-12 weeks, except
in experiments that involved fetal liver cells, when E14.4 embryos were used. Both
sexes were used for experiments. Results were analysed in non-blinded fashion. In
all experiments, randomly chosen wild type and littermates were used.

MEF isolation and cell lines. MEFs were established from approximately 14.5 
days post coitum embryos as previously described from Prdm16+/− breeder
pairs. Briefly, dissected embryo trunks were minced into 1-2 mm fragments, 
resuspended in 3 ml 0.25% trypsin/EDTA (Gibco, Carlsbad, CA) and passed 
20-30 times through a 16 gauge needle. Cell suspensions were incubated at 37 °C 
for 1h with frequent agitation. Erythrocytes were lysed with ACK buffer, washed 
and cells were plated for 3h in 10% FBS/DMEM. Cells remaining in suspension 
were aspirated and adherent cells were cultured with fresh media. MEFs were pas- 
saged 1:3 every 3 days and cells between passage 2 and 5 were used for all experi- 
ments. 293 cells and NIH-3T3 cells were purchased from ATCC (Manassas, VA) 
and sub-cultured in 10% FBS/DMEM or 10% calf serum/DMEM, respectively. 
WT and Mfn2−/− MEFs were a kind gift from E. Schon (Columbia University).
All lines are tested yearly for mycoplasma contamination and found negative.

Plasmids. Prdm16 constructs were generated by subcloning the murine full
length (flPrdm16) or truncated (sPrdm16) cDNA into the XhoI/EcoRI sites of the
pMSCV-IRES-GFP retroviral expression plasmid. The Mito-dsRed construct was 
purchased from Addgene (Cambridge, MA) (plasmid 11151). Mfn2 constructs 
were generated by subcloning the murine Mfn2 cDNA into the EcoRI/BamHI 
sites of the pLVX-EF1α-IRES-GFP or pLVX-EF1α-IRES-mCherry lentiviral
expression plasmid (Clontech). The pGreenFire-Nfat and pGreenFire-CMV gene 
reporter constructs were purchased from System Biosciences (San Jose, CA) and 
contained three canonical Nfat response elements (5’-GGAAAAN-3’) driving 
the expression of copGFP and luciferase reporters. The DNDrp1-pcDNA3.1 
construct was purchased from Addgene (#45161) and subcloned using the 
BamHI/EcoRI restriction sites into the pLVX-IRES-GFP vector. Lentiviral 2nd
generation packaging constructs ΔR8.2 (8455) and pMD2.G (12259) were purchased
from Addgene. The −950/+22 murine MFN2 promoter was constructed
by PCR amplification of the RP23-458J18 BAC clone (CHORI, Oakland, CA) and 
subcloned into the pGL4 luciferase reporter vector (Promega, Madison, WI). All 
cloning was carried out using KOD hot-start polymerase (Novagen, Billerica, 
MA) and subcloned for screening and sequencing into the pCR2.1 shuttle vector 
(Invitrogen, Carlsbad, CA). 

FACS sorting and analysis. For peripheral blood analyses, erythrocytes were 
lysed twice with ACK lysis buffer and nucleated cells were stained with antibody 
cocktail (Supplementary Table 1) in FACS buffer for 15 min on ice, washed and 
analysed on a BD FACSCanto II flow cytometer (Becton Dickinson, Mountain
View, CA). For bone marrow analyses, cells were isolated using the crushing
method and erythrocytes were lysed with ACK lysis buffer followed by 40 μm
filtration. Bone marrow cells were stained with antibody cocktail in FACS buffer
for 30 min on ice, washed and analysed on a BD LSRII flow cytometer (Becton 
Dickinson, Mountain View, CA). Dead cells were excluded from analyses by 
gating out 7AAD-positive cells. To isolate purified haematopoietic populations, 
bone marrow cells were isolated, stained and sorted using a BD Influx cell sorter 
(Becton Dickinson, Mountain View, CA) into complete media. Data were analysed 
using FlowJo9.6 (TreeStar Inc., Ashland, OR). 

Haematopoietic stem cell transplantation. Mfn2fl/fl-Vav-Cre fetal liver cells,
bone marrow cells or purified LT-HSCs (Lin−cKit+Sca1+CD48−Flt3−CD150+)
were transplanted into lethally irradiated (two doses of 478 cGy over 3 h using a
Rad Source RS-2000 X-ray irradiator (Brentwood, TN)) recipients together with
2 x 10° competitor cells. As Mfn2fl/fl-Vav-Cre mice were not fully backcrossed
onto the C57BL/6 background, recipient mice and competitor bone marrow cells
were from the B6.Cg-Tg(Vav1-Cre)A2Kio/J;B6/129SF1-Mfn2tm3Dcc/Mmucd mixed
background mouse strain crossed to B6.SJL-Ptprca Pepcb/BoyJ (CD45.1) to
generate a CD45.1+CD45.2+ mixed background mouse. Competitor cells were
T-cell depleted using MACS beads. For all competitive transplantation experiments,
at least two independent transplants, each with at least 4 recipients per
condition or genotype, were performed, and the results of all recipients were pooled for
statistical analysis. Power calculation was based on results of the first experiment.
In limiting dilution assays, cohorts of recipients received 20 or 50 HSCs together
with 2 x 10° competitor cells, allowing calculation of HSC frequency based on
the number of non-repopulated mice (<1% donor contribution) using Poisson
statistics 15 weeks after reconstitution. For Mfn2 KO single cell transplantation,
LT-HSCs were sorted directly into complete media (StemPro34, 100 ng ml⁻¹
SCF, 100 ng ml⁻¹ TPO, 50 ng ml⁻¹ IL-6) and single cells were visually confirmed.
Positive single cell wells were combined with 2 x 10° CD45.1 competitor bone
marrow cells and transplanted into lethally irradiated CD45.1 recipient mice.
Recipients showing > 0.1% CD45.2 donor contribution were considered positive
and GM/(B+T) ratios were calculated as previously described for characterizing
heterogeneous HSC phenotypes. In transplantations using WT or Prdm16−/−
HSCs (Lin−cKit+Sca1+CD48−Flt3−CD150+), B6.CD45.2 cells were mixed with
2 x 10° freshly isolated B6.CD45.1 bone marrow cells and injected via tail vein
into lethally irradiated (two doses of 478 cGy over 3 h using a Rad Source RS-2000
X-ray irradiator (Brentwood, TN)) B6.CD45.1+CD45.2+ F1 hybrid recipients.
After 8 to 15 weeks, peripheral blood (PB) and bone marrow were analysed.
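
The Poisson-based frequency estimate mentioned for the limiting dilution assays can be illustrated with a minimal sketch. It assumes a single-hit model, in which the probability that a recipient of d cells remains non-repopulated is exp(−f·d), and it finds the maximum-likelihood frequency f by bisection. The function name and the example numbers are hypothetical; this is not the software used by the authors.

```python
import math

def hsc_frequency(doses, n_mice, n_negative):
    """Maximum-likelihood HSC frequency under a single-hit Poisson model.

    A recipient given d cells is assumed to stay non-repopulated with
    probability exp(-f * d), where f is the HSC frequency per cell.
    """
    groups = list(zip(doses, n_mice, n_negative))
    total_neg = sum(neg for _, _, neg in groups)
    total_pos = sum(n - neg for _, n, neg in groups)
    if total_neg == 0 or total_pos == 0:
        raise ValueError("need both repopulated and non-repopulated mice")

    def score(f):
        # Derivative of the log-likelihood with respect to f.
        s = 0.0
        for d, n, neg in groups:
            p0 = math.exp(-f * d)
            s += -neg * d + (n - neg) * d * p0 / (1.0 - p0)
        return s

    lo, hi = 1e-9, 1.0  # search range for f (assumed: at most one HSC per cell)
    for _ in range(200):  # bisection: score() decreases monotonically with f
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical cohorts: 9 mice per dose, 6 and 3 non-repopulated.
f = hsc_frequency([20, 50], [9, 9], [6, 3])
print(f"estimated frequency: 1 in {1 / f:.0f} cells")
```

With a single dose the estimate reduces to the closed form f = −ln(fraction negative)/dose, which is a convenient sanity check.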
Lentivirus production, transduction and integration verification. Lentiviral
particles were produced by seeding 293 cells at 7 x 10° per cm², or PlatE cells
(Cell Biolabs, San Diego, CA), in Ultra Culture serum-free media (Lonza, Basel,
Switzerland) overnight followed by transfection of each packaging and expression
construct (1:1:1) using Trans-It 293 (Mirus, Madison, WI) for 2 h. Media
were pooled after 36-48 h, clarified and concentrated by ultracentrifugation
(100,000g), resuspended in StemPro-34 media and stored at −80 °C. Virus titre
was calculated from transduction of NIH-3T3 fibroblasts with serial dilutions of the
viral preparation. Sorted LT-HSCs were transduced with > 150 MOI lentivirus
particles in the presence of 6 μg ml⁻¹ polybrene (Sigma) and spun at 900g for
20 min at 20 °C. Supernatant was aspirated and replaced with complete media
and cells were cultured overnight. Transduction efficiency of cells was confirmed after 24 h.
To assess proviral copy number 15 weeks post-transplantation in vivo, splenocytes
were harvested and sorted into donor (CD45.2) or competitor (CD45.1)
populations and gDNA was isolated as previously described. Amplification of
the proviral WPRE region was achieved using SYBR Green qPCR assay using 
the primer pair WPREFor: 5’-CCGTTGTCAGGCAACGTG-3’ and WPRERev: 
5’-AGCTGACAGGTGGTGGCAAT-3’. Quantification of proviral copies was 
derived from the linear regression of serial dilutions of viral vector and normal- 
ized to input cell number. 
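
As a worked illustration of the titre and MOI arithmetic implied above, the following sketch uses the conventional relation titre = (cells × fraction transduced × dilution factor) / volume of diluted virus. The function names and numbers are hypothetical, and the text does not state which dilution or readout the authors used.

```python
def titre_tu_per_ml(cells_at_transduction, fraction_positive,
                    virus_volume_ml, dilution_factor):
    """Transducing units (TU) per ml of the undiluted viral stock."""
    return (cells_at_transduction * fraction_positive
            * dilution_factor / virus_volume_ml)

def volume_for_moi_ml(moi, n_target_cells, titre):
    """Volume of stock (ml) that delivers `moi` TU per target cell."""
    return moi * n_target_cells / titre

# Hypothetical example: 1e5 NIH-3T3 cells, 10% reporter-positive after
# adding 10 ul of a 1:100 dilution of the stock.
titre = titre_tu_per_ml(1e5, 0.10, 0.01, 100)
# Volume needed to transduce 1,000 sorted cells at an MOI of 150.
volume = volume_for_moi_ml(150, 1000, titre)
print(f"titre: {titre:.2e} TU/ml, volume: {volume * 1000:.2f} ul")
```

The estimate is most reliable when it is taken from a dilution at which only a minority of cells is transduced, so that most positive cells carry a single integration.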

Quantitative RT-PCR. Sorted or cultured cell populations (2-5 x 10° cells) 
were lysed in TRIzol LS reagent (Invitrogen, Carlsbad, CA) and RNA was iso- 
lated according to manufacturer's instructions. cDNA was synthesized using 
Superscript III Reverse Transcriptase (Invitrogen) and target CT values were 
determined using inventoried TaqMan probes (Applied Biosystems, Carlsbad, 
CA, see Supplementary Table 2) spanning exon/exon boundaries and detected 
using a Viia7 Real Time PCR System (Applied Biosystems). Relative quantification
was calculated using the ΔΔCT method. To estimate relative copy number
of Mfn1 and Mfn2 transcripts (Fig. 2a), copy numbers were derived from the
linear regression of serial dilutions of respective cDNA plasmids and normalized
to GAPDH-VIC values. To estimate relative copy number of flPrdm16 transcripts
(Fig. 2d), a probe was designed to span the SET methyltransferase domain
of Prdm16 (exon 2/3 junction) and copy number was derived from the linear
regression of serial dilutions of respective cDNA plasmids. Another probe (exon
14/15 junction) was used to quantify total Prdm16 copy numbers derived from
the linear regression of serial dilutions of respective cDNA plasmids. The value
derived from the flPrdm16-specific probe was subtracted from that of the total
Prdm16 probe to determine sPrdm16 transcript quantity. All values were normalized to relative
multiplexed GAPDH-VIC values.
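
The quantification steps described above can be sketched as follows: relative expression by the ΔΔCT method (assuming roughly 100% amplification efficiency), copy-number estimation from a plasmid dilution series by linear regression of CT on log10(copies), and the short isoform obtained as total minus full-length copies. All function names and example values are hypothetical.

```python
import math

def ddct_fold_change(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene relative to a control sample."""
    d_sample = ct_target - ct_ref            # delta-CT in the sample
    d_control = ct_target_ctrl - ct_ref_ctrl  # delta-CT in the control
    return 2.0 ** -(d_sample - d_control)

def fit_standard_curve(copies, cts):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a copy number."""
    return 10.0 ** ((ct - intercept) / slope)

def short_isoform_copies(total_copies, full_length_copies):
    """sPrdm16 estimate: total Prdm16 minus flPrdm16, floored at zero."""
    return max(total_copies - full_length_copies, 0.0)

# Hypothetical dilution series of a cDNA plasmid.
slope, intercept = fit_standard_curve([1e2, 1e3, 1e4, 1e5],
                                      [30.0, 26.7, 23.4, 20.1])
print(copies_from_ct(25.0, slope, intercept))
```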

LT-HSC culture. Culture of sorted LT-HSCs was carried out using StemPro34 
media (Invitrogen) supplemented with 10 mM HEPES and 50 ng ml⁻¹ of recombinant
murine SCF, TPO and IL-6 (PeproTech, Rocky Hill, NJ), and cells were cultured in
5% O₂ at 37 °C. In some experiments, LT-HSCs were cultured in the presence
of 500 ng ml⁻¹ VIVIT (Millipore, Billerica, MA) or 30 μM mDivi1 (MolPort, Riga,
Latvia).

Mitochondrial PEG-1500 fusion assay. To demonstrate a mitochondrial fusion 
activity, cell fusion experiments were performed using MEFs as previously 
described. Briefly, BacMam baculovirus constructs (Invitrogen) expressing the
signalling peptide from cytochrome c fused to either GFP or RFP were transduced
separately into MEF cells. Sorted GFP+ and RFP+ MEFs were co-cultured for 24 h
and plasma membranes were fused using PEG-1500 (Roche, Basel, Switzerland).
Fused cells were cultured in DMEM containing cycloheximide (Sigma, St. Louis,
MO) for 4 h and analysed for colocalization of mitochondrial labels.

Chromatin immunoprecipitation (ChIP). Early passage Prdm16−/− MEFs were
transduced with 10 MOI retrovirus for 72h and fixed with 4% paraformaldehyde 
for 10 min. Protein lysates were isolated and chromatin immunoprecipitation was 
carried out using the ChIP-IT Express Enzyme kit (Active Motif, Carlsbad, CA). 
Antibodies used for ChIP include anti-Flag and anti-TF2D. Primer probes were 
designed to span regions of the Mfn2 promoter previously shown to regulate Mfn2 
transcriptional activity (see Supplementary Table 3). Quantification of precipitated
Mfn2 promoter regions was derived from the linear regression of serial
dilutions of bone marrow genomic DNA, normalized to input DNA concentration 
and quantifiable IgG detection was subtracted from sample values. 

Indo-1-AM calcium flux. Bone marrow was freshly isolated and lineage depleted
with the MACS Lineage Depletion Kit (Miltenyi Biotec, San Diego, CA). Cells
were cultured for 30 min in complete medium supplemented with 1 μM Indo-1
prepared as stock supplemented with Pluronic F-127 and incubated at 37 °C for
30 min. Cells were washed and stained for surface markers for 15 min, washed
and allowed to rest for 15 min in PBS with Ca²⁺. FACS tubes were run at
37 °C in the sample port of the LSRII flow cytometer equipped with a 355 nm
excitation laser. Events were collected for 40 s before incubation with 25 μM ATP
or 1 μM SDF1 to induce calcium transients. The average ratio, R, of bound/free
Indo-1 (405 nm/485 nm emission) before stimulation was used to determine baseline
values. Identical samples were equilibrated in 10 mM EGTA PBS without
Ca²⁺ to determine Rmin or stimulated with 1 μM ionomycin to determine Rmax.
The Indo-1 dissociation constant (Kd) was assumed to be 237 nM at 37 °C based
on previous studies. The following equation was then used to relate Indo-1
intensity ratios to [Ca²⁺]i levels:


$$[\mathrm{Ca}^{2+}]_i = K_d \times \frac{R - R_{\min}}{R_{\max} - R}$$
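
A minimal sketch of this ratio-to-concentration conversion is given below, using the Kd of 237 nM stated in the text. The function name and the example ratios are hypothetical. The commonly cited general form of this calibration includes an additional fluorescence scaling factor; it is omitted here because the relation described in the text does not show it.

```python
def calcium_nM(r, r_min, r_max, kd_nM=237.0):
    """Convert an Indo-1 emission ratio R into [Ca2+]i in nM.

    r_min: ratio in Ca2+-free conditions (EGTA)
    r_max: ratio at Ca2+ saturation (ionomycin)
    """
    if not (r_min < r < r_max):
        raise ValueError("R must lie strictly between Rmin and Rmax")
    return kd_nM * (r - r_min) / (r_max - r)

# Hypothetical ratios: a baseline reading and a reading after stimulation.
print(calcium_nM(0.30, 0.10, 0.90))
print(calcium_nM(0.50, 0.10, 0.90))
```

Because the denominator shrinks as R approaches Rmax, small errors in Rmax have a large effect on high-calcium estimates, which is one reason the saturating control is measured on identical samples.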


Immunofluorescence. Sorted or cultured haematopoietic populations (2-5 x 10° 
cells) were collected in complete media and plated onto MicroWell 96-well
glass-bottom plates (Thermo, Waltham, MA) coated with 1 μg ml⁻¹ poly-D-lysine.
Cells were allowed to adhere for 10 min and fixed with 4% PFA for 15 min. Cells
were then permeabilized with 0.1% Triton X-100/PBS for 5 min and blocked with
2% BSA/PBS for 1 h at 4 °C. Cells were incubated with anti-Nfat1 (1:100), anti-Mfn2
(1:200), anti-tubulin (1:200), anti-CD150-APC (1:100) or anti-Flag (1:250)
(see Supplementary Table 1) overnight, washed and incubated with AlexaFluor 
secondary antibodies (Invitrogen) for 1h. Cell nuclei were counterstained with 
DAPI and mounted with fluorescent mounting media (Vector Labs, Burlingame, 
CA). Confocal images were acquired with a Zeiss LSM 700 confocal microscope 
or a Leica DMI 6000B and images were deconvoluted and processed with Leica 
AF6000 software package. 

Gene reporter assays. NIH-3T3, WT or Mfn2−/− MEF cells were plated at 2 × 10⁴
cells per cm² in triplicate overnight and transfected with 500 ng of pGF-Nfat, pGF-CMV
or −950/+22 Mfn2-pGL4 reporter construct, 500 ng of cDNA plasmids as
indicated and 500 ng of either pSV-βGal or pLVX-IRES-mCherry plasmids with
Lipofectamine 3000 according to manufacturer's instructions for 24 or 48 h. Cells
were lysed in reporter lysis buffer (Promega, Madison, WI) and analysed for luciferase
activity using BrightGlo luciferase (Promega) and detected on a Synergy H2
plate reader (BioTek, Winooski, VT). To visualize βGal activity, cell lysate was incubated
in Buffer Z (1 mg ml⁻¹ ONPG, 0.1 M phosphate, pH 7.5, 10 mM KCl, 1 mM
BME, 1 mM MgSO₄) at 37 °C for 1 h. Absorbance values were measured at 405 nm
and used to normalize for transfection efficiency. In WT and Prdm16−/− MEFs,
gene reporter luciferase values were normalized to mCherry excitation values.
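
The normalization described for the reporter assays can be sketched as a per-well ratio of luciferase signal to the transfection control (βGal absorbance or mCherry signal), averaged over triplicate wells and expressed relative to a control condition. The function names and numbers are hypothetical, and the exact averaging scheme used by the authors is not stated in the text.

```python
def normalized_reporter(luciferase, normalizer):
    """Luciferase signal corrected for transfection efficiency."""
    if normalizer <= 0:
        raise ValueError("normalizer signal must be positive")
    return luciferase / normalizer

def fold_over_control(sample_wells, control_wells):
    """Mean normalized activity of sample wells relative to control wells.

    Each argument is a list of (luciferase, normalizer) pairs, one per well.
    """
    def mean_activity(wells):
        values = [normalized_reporter(luc, norm) for luc, norm in wells]
        return sum(values) / len(values)
    return mean_activity(sample_wells) / mean_activity(control_wells)

# Hypothetical triplicates: (luciferase counts, absorbance at 405 nm).
sample = [(200.0, 1.0), (400.0, 2.0), (300.0, 1.5)]
control = [(100.0, 1.0), (100.0, 1.0), (50.0, 0.5)]
print(fold_over_control(sample, control))
```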
Western blot. For total cell lysate experiments, MEF cultures were lysed in RIPA
buffer, 50 mM Tris pH 7.5, 137 mM NaCl, 0.1% SDS, 0.5% deoxycholate and protease
inhibitors (Roche). For subcellular fractionation studies, cells were scraped
and washed in PBS. Cell pellets were lysed in 5× packed cell volume (pcv) Buffer A
for 10 min on ice and vortexed for 15 s in the presence of 1/10 volume 3% NP-40.
Plasma membrane lysis was verified by trypan blue staining. Lysate was spun at
15,000g for 10 min at 4 °C and the cytoplasmic fraction was saved. The remaining
nuclear pellet was resuspended in 2.5× pcv Buffer C and incubated at 4 °C for 1 h
with rotation and spun at 15,000g for 10 min. The nuclear fraction was diluted
with 2.5 volumes of Nuclear Diluent Buffer and stored at −80 °C. To achieve even
fractionation loading, equivalent percentages of nuclear and cytoplasmic fractions
were loaded on each gel. All protein samples were denatured in 4× sample buffer
at 95 °C and loaded onto 4–12% Bis-Tris SDS-PAGE gradient gels (Invitrogen).
Gels were transferred onto 0.22 μm nitrocellulose membrane and stained with
Ruby Red (Molecular Probes, Carlsbad, CA) to confirm transfer. Membranes were
blocked with 3% non-fat milk or BSA in 0.1% Tween-20/TBS and incubated with
anti-Mfn2 (1:200), anti-βGal (1:1,000), anti-Nfat1 (1:250), anti-tubulin (1:1,000),
anti-lamin A/C (1:500) and anti-β-actin (1:5,000) overnight (see Supplementary
Table 1). Membranes were washed, incubated with HRPO-conjugated secondary
antibodies and exposed to X-ray film (Denville) after incubation with Super Signal
West Femto ECL reagent (Pierce).
Image quantification. For mitochondrial length measurements, confocal or 
deconvoluted z-stacks were collected and projected as a z-project in ImageJ 
(NIH, Bethesda, MD). Individual mitochondria were manually traced, binned 
into length categories and expressed as percent of cellular mitochondria. The 
mean +s.e.m. number of mitochondria falling into each length category collected 
from > 15 fields (30-50 cells) are expressed. For Nfat nuclear localization quantification,
confocal or deconvoluted z-stacks were collected and a 1-μm section in the
centre of the cell was projected as a z-project in ImageJ. Nuclear boundaries were 
constructed using DAPI staining. The ratio of staining within the nuclear bound- 
ary to total staining was expressed as percent of Nfat signal. The mean +s.e.m. 
for > 10 fields (20-40 cells) are expressed. For immunofluorescence intensity 
measurements, confocal or deconvoluted z-stacks were collected and projected as 
a z-project in ImageJ. Thresholds were set based on IgG-stained negative control 
cells and the integrated density value of each signal per cell was recorded. The 
mean +s.e.m. for > 15 fields (30-50 cells) are expressed. 
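
The two image-derived measures described above, the mitochondrial length profile and the nuclear fraction of Nfat signal, can be sketched as follows. The bin edges, function names and toy data are hypothetical; the authors traced mitochondria manually in ImageJ, and their length categories are not listed in this section.

```python
def length_profile(lengths_um, bin_edges_um):
    """Percent of traced mitochondria falling into each length bin.

    A length L is assigned to bin i when edges[i] <= L < edges[i + 1].
    """
    counts = [0] * (len(bin_edges_um) - 1)
    for length in lengths_um:
        for i in range(len(counts)):
            if bin_edges_um[i] <= length < bin_edges_um[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no lengths fell inside the bins")
    return [100.0 * c / total for c in counts]

def nuclear_percent(signal, nuclear_mask):
    """Percent of total signal lying inside the nuclear boundary.

    `signal` and `nuclear_mask` are equally shaped 2D lists; the mask is
    True where the DAPI-defined nucleus covers the pixel.
    """
    total = 0.0
    nuclear = 0.0
    for signal_row, mask_row in zip(signal, nuclear_mask):
        for value, inside in zip(signal_row, mask_row):
            total += value
            if inside:
                nuclear += value
    if total == 0:
        raise ValueError("image contains no signal")
    return 100.0 * nuclear / total

# Toy example with assumed bin edges of 0-1, 1-2, 2-3 and >3 um.
print(length_profile([0.5, 1.5, 2.5, 3.5], [0, 1, 2, 3, float("inf")]))
print(nuclear_percent([[1, 2], [3, 4]], [[True, False], [False, True]]))
```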
Statistics. For statistical analysis between two groups, the unpaired Student’s 
t-test was used. When more than two groups were compared, one-way ANOVA 
was used. Results are expressed as mean + s.e.m. The Bonferroni and Dunnett 
multiple comparison tests were used for post-hoc analysis to determine statistical 
significance between multiple groups. All statistics were calculated using Prism5 
(GraphPad, La Jolla, CA) software. Differences among group means were considered
significant when the probability value, P, was less than 0.05. Sample size
('n') always represents biological replicates. Cochran test was used for exclusion
of outliers.
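
For illustration, the summary statistic (mean ± s.e.m.) and the unpaired Student's t statistic named above can be computed as in the sketch below. It assumes equal variances (the classical pooled-variance form) and returns only the t value and degrees of freedom; the P value would then be read from the t distribution, for example by a statistics package. The function names and data are hypothetical, and the authors report using Prism.

```python
import math

def mean_sem(values):
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

def student_t(a, b):
    """Unpaired, pooled-variance t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    df = na + nb - 2
    pooled_variance = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_variance * (1 / na + 1 / nb))
    return t, df

# Hypothetical measurements from three biological replicates per group.
print(mean_sem([1.0, 2.0, 3.0]))
print(student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```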

No statistical methods were used to predetermine sample size. The experiments 
were not randomized, and the investigators were not blinded to allocation during 
experiments and outcome assessment.

Extended Data Figure 1 | Representative sort gates for isolation of HSCs. Flow cytometric plots showing the gates used to isolate HSCs
(Lin−Sca1+Kit+Flt3−CD48−) and CD150hi and CD150lo HSCs.

Extended Data Figure 2 | Mitochondrial dynamics in Prdm16−/−
MEFs. a, Frequency distribution (left) and frequency distribution profile
(right) of mitochondrial length in WT or Prdm16−/− fetal HSCs. Bars,
mean ± s.e.m.; n > 20 fields of cells from three biological replicates;
*P < 0.05 within length bins; two-tailed Student's t-test within length bins.
b, Mitochondrial morphology in Prdm16−/− MEFs treated for 24 h with
the Drp1 inhibitor, mDivi (30 μM), or vehicle is shown for comparison,
Mitotracker Red staining, scale bar 20 μm). c, Frequency distribution
(left) and frequency distribution profile (right) of mitochondrial length
in WT and Prdm16−/− MEFs treated for 24 h with the Drp1 inhibitor,
mDivi (30 μM), or vehicle. Bars, mean ± s.e.m.; n > 16 fields from three
biological replicates; *P < 0.05 within length bins; one-way ANOVA with
Bonferroni's post-hoc test within length bins. d, Fluorescence micrographs
[…] baculovirus expressing mitochondria-tagged GFP and RFP before
PEG-mediated fusion (scale bar 50 μm). Fused mitochondria are yellow
(arrows), and were only observed in fusions of WT cells. e, Western
blot (upper, Supplementary Fig. 1 for full scans) and quantification of
western blots for Mfn2 of WT and Prdm16−/− MEFs. Bars, mean ± s.e.m.;
n = 3 fields biological replicates; *P < 0.05; two-tailed Student's t-test.
f, Mitochondrial morphology in WT, Prdm16−/− and Mfn2−/− MEFs
visualized by Tomm20 staining (red) (scale bar 20 μm). g, Mitochondrial
length in WT and Prdm16−/− MEFs transduced with EV or Mfn2-IRES-GFP
(Mfn2-GFP). Bars, mean ± s.e.m.; n > 24 fields from three biological
replicates; *P < 0.05 compared to Prdm16−/− EV in each length bin;
one-way ANOVA with Bonferroni’s post-hoc test within length bins. 
Extended Data Figure 3 | sPrdm16 interacts with the Mfn2
promoter. a, Schematic representation of Prdm16 protein domain
structure. b, qPCR probe amplification scheme covering the Mfn2
promoter used in chromatin immunoprecipitation assays. c, Proximal
Mfn2 promoter luciferase gene reporter assay in WT MEFs transfected
24 h previously with flPrdm16, sPrdm16 or control vectors. Bars,
mean ± s.e.m.; n = 3 biological replicates; *P < 0.05; one-way ANOVA
with Dunnett's post-hoc test. d, ChIP quantification in Prdm16−/− MEFs
transduced with retroviral Flag–Prdm16 constructs immunoprecipitated
using Flag and TFIID antibodies. TFIID positive (GAPDH) and negative
(intergenic Chr 6) control probes are also shown. Quantification was
performed after establishment of qPCR standard curves for all probes
(see Methods). Bars, mean ± s.e.m.; n = 3 technical replicates
representative of two biological replicates; *P < 0.05; one-way ANOVA with
Dunnett's post-hoc test. e, Percentage CD45.2 (donor) contribution in
PB WBC in CD45.1+CD45.2+ mice reconstituted with 200 transduced
(IRES-GFP or Mfn2-IRES-GFP) Prdm16+/− CD45.2+ HSCs and 2 x 10°
CD45.1+ competitor bone marrow cells. Plots, mean ± …;
… recipients pooled from four independent transplants;
… Student's t-test.
Extended Data Figure 4 | Mitochondrial morphology of bone marrow
populations. a, Representative images of mitochondrial morphology in
Pham-reporter+ CD150hi and CD150lo HSCs (lin−Sca1+kit+Flt3−CD48−),
MPPs (lin−Sca1+kit+CD48+), CMPs (lin−Sca1−kit+), CLPs
(lin−Sca1lokitloIL7Rα+Flt3+) and lineage+ cells (scale bar, 5 μm).
b, Mitochondrial length frequency distribution in Pham-reporter+ HSCs,
progenitors and Lin+ cells. Bars, mean ± s.e.m.; n > 14 fields from three
biological replicates; *P < 0.05 within length bins; one-way ANOVA with
Bonferroni's post-hoc test within length bins.
Extended Data Figure 5 | Phenotype and function of Mfn2−/− HSCs.
a, Mitochondrial morphology of HSCs from Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) mice visualized with immunostaining for Tomm20.
b, Frequency distribution of mitochondrial lengths (right panel, compared
to WT in each length bin). Bars, mean ± s.e.m.; n > 7 fields from two
biological replicates; *P < 0.05 within length bins; two-tailed Student's
t-test within length bins. c, CD150 surface staining mean fluorescence
intensity (MFI) of HSCs (Lin−Sca1+kit+CD48−) in 8–12-week-old Mfn2fl/fl
or Mfn2fl/fl-Vav-Cre (Mfn2−/−) mice. n = 25 biological replicates; *P < 0.05;
two-tailed Student's t-test. d, e, CD150 MFI of Mfn2fl/fl or Mfn2fl/fl-Vav-Cre
(Mfn2−/−) donor HSCs 15 weeks after competitive
transplantation. Plots in d, mean ± s.e.m.; n > 15 mice from three
biological replicates; *P < 0.05; two-tailed Student's t-test. f, Donor
(CD45.2) chimaerism 15 weeks after competitive transplantation of 2 x 10°
Mfn2fl/fl or Mfn2fl/fl-Vav-Cre (Mfn2−/−) fetal liver cells together with
2 x 10° CD45.1+ competitor bone marrow cells into CD45.1+CD45.2+
recipients. Plots, mean ± s.e.m.; n = 10 mice pooled from two independent
transplants; *P < 0.05; two-tailed Student's t-test.
Extended Data Figure 6 | Analysis of competitive repopulation
experiments and lentiviral vector transduction. a, Schematic representation
of lentiviral constructs used. b, Representative example of transduction
efficiency of purified HSCs with IRES-GFP and IRES-mCherry vectors
24–48 h post-transduction. c, Schematic representation of transduction
and subsequent competitive repopulation experiments. d, Representative
flow cytometric plots of the gates used to analyse recipients of competitive
repopulation experiments. The CD45.1/CD45.2 gates were also applied to
the myeloid, B and T cell gates to determine donor contribution to individual
lineages. In this example, HSCs had been transduced with an IRES-GFP
lentiviral vector (efficiency 80%, see b). Although donor repopulation was
high in the periphery (upper left panel, 71.8%), no GFP was detected (lower
left panel), suggesting silencing of the vector. e, Quantification of proviral
copy number in donor-derived (CD45.2+) cells 15 weeks after transduction and
competitive transplantation of HSCs. Despite silencing of the vector (see d),
approximately 1 proviral copy was present per donor HSC-derived cell. n = 3
biological replicates; *P < 0.05; two-tailed Student's t-test. f, Representative
example of transduction efficiency of purified HSCs with IRES-GFP
and Mfn2-IRES-GFP lentiviral vectors. g, Expression of Mfn2 mRNA in
CD150hi and CD150lo HSCs 24 h post-transduction relative to EV control
(left panel) and in donor-derived HSCs, CMPs and lineage+ cells 15 weeks
after competitive transplantation of 200 transduced HSCs together with
2 x 10° CD45.1+ competitor cells (right panel). The data indicate partial
silencing of the vector in HSCs, and complete silencing in the progenitors
and more mature cells. n = 3 biological replicates; *P < 0.05; two-way analysis
of variance (ANOVA) with Bonferroni's post-hoc test. h, Donor (CD45.2+)
chimaerism 15 weeks after transplantation of 200 CD45.2+ HSCs transduced
for 24 h with SIN LTR constructs pLVX-IRES-GFP or pLVX-Mfn2-IRES-GFP
together with 2 x 10° CD45.1+ competitor fetal liver cells into lethally
irradiated CD45.1+CD45.2+ recipient mice. Plots, mean ± s.e.m.; n > 19
recipients from four transplantation experiments; *P < 0.05; two-tailed
Student's t-test. i, Same experiments as in h, but using the non-SIN LTR
vector, pHR. Plots, mean ± s.e.m.; n = 5 recipients; *P < 0.05; two-tailed
Student's t-test.
Extended Data Figure 7 | Nfat activity in Mfn2−/− and Prdm16−/−
MEFs. a, Mfn2fl/fl (blue) or Mfn2fl/fl-Vav-Cre (Mfn2−/−) (red) HSCs were
stained with DCFDA to visualize intracellular reactive oxygen species
levels. Cells were treated with oligomycin for 15 min as a positive control.
b, HSC apoptosis levels were analysed by surface staining of Annexin V.
c, Intracellular staining for the endoplasmic reticulum stress response
chaperone GRP78 in Mfn2fl/fl or Mfn2fl/fl-Vav-Cre (Mfn2−/−) HSCs (red
colours), MPPs (green colours) and CMPs (blue colours). Histograms;
representative of three biological replicates. d, e, Calcium flux trace (left) and
baseline intracellular Ca2+ (right) in WT LSK cells transduced with
IRES-GFP or Mfn2-IRES-GFP lentiviral vectors (d) and in WT or
Mfn2−/− MEFs (e) stained with Indo1. Bars, mean ± s.e.m.; n = 3 biological
replicates; *P < 0.05; two-tailed Student's t-test. f, Nfat1 staining (left,
scale bars, 5 μm) and fraction of nuclear Nfat1 (right, *P < 0.05) in
HSCs transduced with IRES-GFP or Mfn2-IRES-GFP lentiviral vector.
Bars, mean ± s.e.m.; n > 10 from two biological replicates; *P < 0.05;
two-tailed Student's t-test. g, Nfat1 staining (left; scale bars, 20 μm) and
fraction of nuclear Nfat1 (right; *P < 0.05) in WT or Mfn2−/− MEFs.
Bars, mean ± s.d.; n > 20 fields from three independent experiments;
*P < 0.05; two-tailed Student's t-test. h, Nfat luciferase gene reporter
activity in WT or Mfn2−/− MEFs. Bars, mean ± s.d.; n = 3 experiments;
*P < 0.05; two-tailed Student's t-test. i, Mitochondrial morphology after
transfection of dominant negative Drp1 into HeLa cells (scale bars, 5 μm).
Note extreme elongation of mitochondria (arrow), confirming that this
vector is functional. j, Nfat1 staining (left, scale bars, 5 μm) and fraction
of nuclear Nfat1 (right) in HSCs lentivirally transduced with sPrdm16
and flPrdm16. Plots, mean ± s.e.m.; n > 14 fields of cells pooled from two
biological replicates; *P < 0.05; one-way ANOVA with Dunnett's post-hoc
test. k, Nfat luciferase gene reporter activity in WT and in Prdm16−/−
MEFs transfected with Mfn2, flPrdm16 or sPrdm16 constructs. Bars,
mean ± s.d.; n = 3 experiments; *P < 0.05; one-way ANOVA with
Dunnett's post-hoc test.
Extended Data Figure 8 | Effect of VIVIT on WT HSCs. a, Expression
of lymphoid (Sox4 and Il7r) and myeloid/platelet (Vwf and Fli1) genes in
HSCs treated with VIVIT or DMSO for 24 h. n = 3 biological replicates;
*P < 0.05; Student's t-test. b, Lymphoid/myeloid ratio of CD45.2+ HSCs
cultured for 4 days in DMSO or 500 nM VIVIT, transplanted with
2 x 10° CD45.1+CD45.2+ competitor bone marrow cells into CD45.1+
recipients and analysed 15 weeks post-transplant for lymphoid/myeloid
ratio of donor compartment. Plots, mean ± s.e.m.; n = 5–9 recipients
from two transplant experiments; *P < 0.05; two-tailed Student's
t-test. c, Donor chimaerism analysis 15 weeks after transplantation of
CD45.2+ HSCs transduced with IRES-GFP or VIVIT-GFP with 2 x 10°
CD45.1+CD45.2+ competitor bone marrow cells into CD45.1+ recipients.
Plots, mean ± s.e.m.; n = 9–12 recipients from two independent transplant
experiments; *P < 0.05; two-tailed Student's t-test. Bottom panel shows
lymphoid and myeloid donor contribution in each individual recipient.
Extended Data Table 1 | Phenotypic analysis of the haematopoietic system in Mfn2fl/fl-Vav-Cre mice


A 


Hematopoietic Stem Cell Compartments 


LSK CD48- Fit3- 


Bi CD150+ 
Mfn2 f/f 
0.277% 0.053% 
0.319% 0.082% 
0.346% 0.058% 
Mean 0.314% +0.03 0.064% + 0.02* 
Mfn2 f/f Vav-Cre B Peripheral blood counts 
0.213% 0.021% wr ie) 
0.575% 0.047% N= 8 4 
0.220% 0.039% wec faye 2e aa 
Mean 0.336% +0.20 0.036% + 0.01 4 % 15.4 = 92 15.4 = 12.0 
es% 73.0 + 12.8 78.2 = 12.6 
62+ 28 §9 = 32 
Cc Bone Marrow Progenitor Compartments — . 04 + 03 13214 
MPP GMP CMP MEP CLP Pe ota oF 04 = 04 
Mfn2 f/f NRBC% * * 
0.00272% 0.00184% 0.00281% 0.00049% Hematoott 45.1 * 4.2 45.3 = 67 
0.00841% 0.00469% 0.00958% 0.00067% REC 87 + 07 89 = 12 
0.00149% 0.00164% 0.00135% 0.00409% hd ae is a 7 
Mean 0.314% +0.03 0.0042% + 4e-5 0.0027% + 2e-5 0.0046% + 4e-5 0.00175% + 2e-5 pili 14.1 « 05 144 « 05 
Mfn2 f/f Vav-Cre MCHC 27.3 + 08 28.4 = 07 
0.00321% 0.00224% 0.00369% 0.00108% ROW 152 = 17 15.5 = 14 
0.00844% 0.00550% 0.00589% 0.00041% RO 72 =: OO 92a 09 
RETICS 474 = 29.9 49.1 = 42.0 
0.00136% 0.00110% 0.00142% 0.00916% maria 06 = 04 06 = 05 
Mean 0.336% +0.20 0.0043% + 4e-5 0.0029% + 2e-5 0.0037% + 2e-5 0.00355%+5e-5 | piateet 6428 = 1464 653.0 = 66.7 
D Splenocyte Lineages E Peripheral Blood Lineages 
Mac1+ or GR1+ CD3+ CD19+ Mac1+ or GR1+ CD3+ CD19+ 
Mfn2 f/f Mfn2 f/f 
6.840% 17.60000% 59.70000% 26.300% 9.75280% 54.75000% 
9.910% 27.80000% 63.70000% 23.100% 26.800% 42.135% 
8.375% 22.700% 61.700% 19.900% 43.84800% 29.51910% 
Mean 8.37% ± 0.02 22.70% ± 0.05 61.70% ± 0.02 Mean 23.1% ± 0.03 26.8% ± 17 42.1% ± 13
Mfn2 f/f Vav-Cre Mfn2 f/f Vav-Cre 
11.800% 13.79000% 62.59000% 15.200% 15.63765% 63.22500% 
9.070% 27.50000% 63.20000% 17.900% 35.944% 41.143% 
10.435% 20.645% 62.895% 20.600% 56.24940% 19.06100% 
Mean 10.43% ± 0.01 20.65% ± 0.07 62.90% ± 0.01 Mean 17.9% ± 0.02 35.9% ± 20 62.9% ± 22
F Thymus Lineages 
CD4 CD8 DP DN DN1 DN2 DN3 DN4 
Mfn2 f/f 
10.50% 2.95% 81.60% 2.11% 0.03% 0.10% 0.54% 0.19% 
11.10% 2.10% 81.30% 3.10% 0.04% 0.15% 0.81% 0.17% 
5.95% 3.01% 83.20% 1.48% 0.03% 0.10% 0.41% 0.06% 
Mean 9.2% ± 2.8 2.7% ± 0.5 82.1% ± 1.0 2.2% ± 0.8 0.33% ± 0.006 0.117% ± 0.03 0.587% ± 0.20 0.14% ± 0.07
Mfn2 f/f Vav-Cre 
9.42% 2.56% 83.00% 1.86% 0.03% 0.13% 0.54% 0.20% 
9.25% 2.55% 84.50% 2.09% 0.06% 0.13% 0.81% 0.20% 
11.30% 3.44% 80% 2.74% 0.03% 0.09% 0.57% 0.21% 
Mean 10.0% ± 1.1 2.8% ± 0.5 82.5% ± 2.3 2.2% ± 0.5 0.040% ± 0.02 0.117% ± 0.02 0.64% ± 0.15 0.203% ± 0.01


a–f, Analysis of 8–10-week-old Mfn2fl/fl or Mfn2fl/fl-Vav-Cre mice for (a) frequency of LSK cells and HSCs, (b) peripheral blood counts, (c) bone marrow progenitor populations (MPPs
(lin−Sca1+kit+CD48+), CMPs (lin−Sca1−kit+), CLPs (lin−Sca1lokitloIL7Rα+Flt3+), MEPs (Lin−Kit+Sca1−CD16/32loCD34−)), (d) myeloid (Gr1+/Mac1+), T (CD3+) and B (CD19+) cells in the spleen,
(e) myeloid (Gr1+/Mac1+), T (CD3+) and B (CD19+) cells in peripheral blood mononuclear cells, and (f) thymocyte subpopulations (DP: CD4+CD8+, DN: CD4−CD8−, DN1: CD4−CD8−CD44+CD25−,
DN2: CD4−CD8−CD44+CD25+, DN3: CD4−CD8−CD44−CD25+, DN4: CD4−CD8−CD44−CD25−). Data are mean ± s.e.m.; n = 3 mice; *P < 0.05; two-tailed Student's t-test.
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Extended Data Table 2 | Statistical analysis of limiting dilution experiments shown in Fig. 3c 


total reconstitution lymphoid reconstitution myeloid reconstitution 


Lower Estimate Upper Lower Estimate Upper Group Lower Estimate Upper 
KO 108.1 55.6 28.72 | KO 407.6 153.2 35.8 20.23 11.5 
30.3 17 9.59 | WT 69.8 38.2 18.5 8.96 4.5 


DF P.value 
1 0.0785 


DF P.value Chisq 


1 0.00957 


P.value 
0.00571 


Chisq 


3.1 


1 6.71 
Graded Foxo1 activity in Treg cells differentiates
tumour immunity from spontaneous autoimmunity


Chong T. Luo1,2, Will Liao3, Saida Dadi1, Ahmed Toure1 & Ming O. Li1


Regulatory T (Treg) cells expressing the transcription factor
Foxp3 have a pivotal role in maintaining immunological
self-tolerance1–5; yet, excessive Treg cell activities suppress anti-tumour
immune responses6–8. Compared to the resting Treg (rTreg) cell
phenotype in secondary lymphoid organs, Treg cells in
non-lymphoid tissues exhibit an activated Treg (aTreg) cell phenotype9–11.
However, the function of aTreg cells and whether their generation
can be manipulated are largely unexplored. Here we show that
the transcription factor Foxo1, previously demonstrated to promote
Treg cell suppression of lymphoproliferative diseases12,13, has an
unexpected function in inhibiting aTreg-cell-mediated immune
tolerance in mice. We find that aTreg cells turned over at a slower rate
than rTreg cells, but were not locally maintained in tissues. aTreg cell
differentiation was associated with repression of Foxo1-dependent
gene transcription, concomitant with reduced Foxo1 expression,
cytoplasmic localization and enhanced phosphorylation at the
Akt sites. Treg-cell-specific expression of an Akt-insensitive Foxo1
mutant prevented downregulation of lymphoid organ homing
molecules, and impeded Treg cell homing to non-lymphoid organs,
causing CD8+ T-cell-mediated autoimmune diseases. Compared
to Treg cells from healthy tissues, tumour-infiltrating Treg cells
downregulated Foxo1 target genes more substantially. Expression of
the Foxo1 mutant at a lower dose was sufficient to deplete
tumour-associated Treg cells, activate effector CD8+ T cells, and inhibit
tumour growth without inflicting autoimmunity. Thus, Foxo1
inactivation is essential for the migration of aTreg cells that have
a crucial function in suppressing CD8+ T-cell responses; and the
Foxo signalling pathway in Treg cells can be titrated to break tumour
immune tolerance preferentially.

rTreg cells, defined by high expression of CD62L and low expression
of CD44, were abundant in lymph nodes and spleens, whereas
CD62LloCD44hi aTreg cells were present in both lymphoid organs and
non-lymphoid tissues such as the liver and lamina propria of the
intestine (Extended Data Fig. 1a). To examine how Treg cells are maintained
in these tissues, we connected congenically marked C57BL/6 mice
using parabiosis (Extended Data Fig. 1b). In line with a recent study14,
rTreg cells as well as naive CD4+ T cells reached chimaerism of
approximately 50%, and aTreg cells, in particular lamina propria Treg cells, were
skewed towards the host 2 weeks after surgery (Fig. 1a). Nevertheless, in
contrast to liver-resident CD49a+NK1.1+ cells, all Treg cell populations
were mixed by 4 weeks (Fig. 1a), revealing that they were not locally
sustained for an extended period.

Antigen-experienced conventional T cells that recirculate through
blood, lymph and non-lymphoid tissues can be short-lived effector cells
or long-lived effector memory cells15. To dissect the homeostatic
properties of Treg cells, we disconnected the parabionts after 4 weeks, and
assessed the turnover of cells originating from the non-host parabiont
(Extended Data Fig. 1b, c). Lymph node or splenic rTreg cells turned
over at a rate close to that of naive CD4+ T cells, with a decay half-time
of between 4 and 7 weeks (Fig. 1b). In contrast, aTreg cells from these
tissues turned over at a substantially slower rate, with a half-time of
around 14 weeks (Fig. 1b). Notably, liver or lamina propria Treg cells
had a comparable decay half-time of between 13 and 15 weeks (Fig. 1b).
Thus, compared to rTreg cells, aTreg cells from both lymphoid and
non-lymphoid tissues turn over more slowly, resembling effector memory
T cells.
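
To make the half-times quoted above concrete, the following minimal sketch relates a half-time to the fraction of a population that remains after a given interval. It is an illustration only: it assumes single-exponential decay of the non-host fraction, which the text does not state, and the function names and the 12-week example are hypothetical rather than taken from the data.

```python
import math


def fraction_remaining(weeks: float, half_time: float) -> float:
    """Fraction of the starting population left after `weeks`.

    Assumes single-exponential decay (an assumption for illustration,
    not a model stated in the text).
    """
    return 0.5 ** (weeks / half_time)


def half_time_from_two_points(f_start: float, f_end: float,
                              weeks_between: float) -> float:
    """Half-time implied by two chimaerism values measured `weeks_between` apart."""
    if not (f_start > f_end > 0):
        raise ValueError("expected f_start > f_end > 0 for a decaying population")
    return weeks_between * math.log(2) / math.log(f_start / f_end)


# Hypothetical comparison 12 weeks after separation of the parabionts:
# a 14-week half-time (aTreg-like) versus a 5.5-week half-time
# (midpoint of the 4-7 week range quoted for rTreg cells).
slow = fraction_remaining(12, half_time=14)
fast = fraction_remaining(12, half_time=5.5)
print(f"aTreg-like: {slow:.2f} of start; rTreg-like: {fast:.2f} of start")
```

Under this assumed model, a population with the longer half-time always retains the larger fraction of its starting size at any given time point, which is the sense in which aTreg cells are described as turning over more slowly.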

We wanted to determine how aTreg cell trafficking and homeostasis
are regulated, and whether these processes can be manipulated. The
transcription factor Foxo1 integrates diverse environmental signals
to control T-cell homeostasis and differentiation16,17. Expression of
Foxo1 is essential for Treg cell function12,18, but its role in aTreg and
rTreg cell subsets has not been defined. To this end, we performed
gene-expression profiling experiments of splenic aTreg and rTreg
subsets. By cross-referencing the differentially expressed genes and
Figure 1 | aTreg cells have a slow turnover, but are not locally maintained
in non-lymphoid tissues. a, The frequencies of non-host derived cells in
parabiotic mice 2 or 4 weeks after surgery, including naive CD4+ T cells
(CD4+Foxp3−CD62LhiCD44lo), rTreg cells (CD4+Foxp3+CD62LhiCD44lo)
and aTreg cells (CD4+Foxp3+CD62LloCD44hi) in the lymph node (LN) and
spleen, total Treg cells (CD4+Foxp3+) in the liver and colon lamina propria
(LP), and CD49a+NK1.1+ innate lymphoid cells in the liver. n = 6.
b, Parabionts were separated 4 weeks after connection, and the percentages
of non-host chimaerism at 2, 6, 12 and 18 weeks after separation are
shown. t1/2 depicts the amount of time it took the population to decay to
half of its original size. n = 4 for each time point. Data are mean ± s.e.m.
Comparison between aTreg and rTreg: P = 0.0025 for lymph node;
P = 0.0335 for spleen (two-way analysis of variance (ANOVA)).


1Immunology Program, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA. 2Louis V. Gerstner Jr Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer
Center, New York, New York 10065, USA. 3New York Genome Center, New York, New York 10013, USA.
Figure 2 | aTreg cell differentiation is associated with downregulation
of Foxo1-dependent gene expression. a, Gene expression comparison of
Foxo1 direct target genes in splenic aTreg (CD62LloCD44hi) versus rTreg
(CD62LhiCD44lo) cells. Foxo1 direct target genes are defined as:
(1) being differentially expressed between wild-type and Foxo1 knockout
Treg cells; (2) having their expression corrected by Foxo1CA; and (3)
having Foxo1 recruited to the gene locus. b, c, e, Flow cytometric analysis
of Foxo1-activated target genes CCR7 and Bim (b), Foxo1 protein (using

the Foxo1-regulated genes12, we found that aTreg or rTreg cells
preferentially expressed the Foxo1-downregulated or -upregulated
transcripts, respectively (Extended Data Fig. 2a and Supplementary Table).
Furthermore, in reference to a Foxo1 direct target gene signature12, the
Foxo1-repressed or -activated transcripts were enriched in aTreg or rTreg
cells, respectively (Fig. 2a and Supplementary Table). Notably, several
Foxo1-activated genes that promote lymphoid organ homing,
including Klf2, Ccr7 and S1pr1, were highly expressed in rTreg cells, whereas
the Foxo1-repressed genes potentially involved in T-cell migration or
retention in tissues, such as Lamc1, Nid2 and Mmp9, were induced in
aTreg cells (Extended Data Fig. 2b, c and Supplementary Table). In line
with RNA-sequencing data, protein levels of two well-defined
Foxo1-activated targets, Ccr7 and Bim, were downregulated in aTreg cells
(Fig. 2b), supporting the conclusion that Foxo1-dependent gene expression is
repressed in aTreg cells.

Compared to rTreg cells, aTreg cells expressed lower amounts of
Foxo1 transcript and protein (Fig. 2c and Supplementary Table).
Foxo1 nuclear localization and protein stability are attenuated after
phosphorylation by Akt16,17. Indeed, while Foxo1 predominantly resided
in the nucleus of rTreg cells, it translocated to the cytoplasm of aTreg
cells (Fig. 2d). Phosphorylation of Akt, Foxo1 and the mTORC1
signalling pathway marker S6 ribosomal protein was increased in aTreg cells
(Fig. 2e and Extended Data Fig. 2d). These findings demonstrate
that aTreg cell differentiation is associated with activation of the Akt
kinase, with concomitant repression of Foxo1 nuclear localization and
expression.

Inclusion of an Akt inhibitor prevented the induction of aTreg from
rTreg cells in vitro (Extended Data Fig. 2e). To determine the
specific role of Foxo1 inactivation in aTreg cells in vivo, we used a mouse
strain expressing a mutant form of Foxo1 that is refractory to
Akt-triggered inhibition19. Mice carrying the Rosa26-floxed stop mutant
Foxo1 allele were bred to the Foxp3cre background to induce
Treg-cell-specific expression of a constitutively active form of Foxo1 (ref. 12),
herein designated CA. CA was expressed at increasing levels in


green fluorescent protein (GFP) as a reporter; Foxo1tag-GFP) (c), and
phosphorylated Akt and Foxo1 (p-Akt and p-Foxo1) (e) in splenic aTreg
and rTreg cells. Mean ± s.e.m. of mean fluorescence intensity (MFI),
normalized to rTreg cells, are shown. n = 4 (c); n = 3 (e). *P < 0.05;
**P < 0.01; ***P < 0.001 (paired t-test). d, Immunofluorescence staining
of Foxo1 in splenic rTreg and aTreg cells. n = 70. Cyto, Nuc and Nuc/Cyto
denote Foxo1 expression in the cytoplasm, nucleus and both the nucleus
and cytoplasm, respectively. Original magnification, ×60.


CA/+ (Foxp3creFoxo1CA/+) or CA/CA (Foxp3creFoxo1CA/Foxo1CA)
Treg cells, and was constitutively localized in the nucleus (Extended
Data Fig. 3a, b). As expected, CA triggered a dose-dependent increase
of its target gene expression (Extended Data Fig. 3c). Thymic Treg cell
differentiation was unperturbed in CA/+ or CA/CA mice (data not
shown). However, CD62Llo aTreg phenotype cells in lymph nodes were
proportionally decreased in 9–12-day-old CA/+ or CA/CA mice
(Fig. 3a), in line with a role for Foxo1 in inducing CD62L
expression20,21. T-cell activation markers CD69 and ICOS were comparably
induced among Treg cells from wild-type, CA/+ or CA/CA mice,
with the exception that a higher fraction of CD62Lhi Treg phenotype
cells expressed these molecules in CA/+ or CA/CA mice (Fig. 3b and
Extended Data Fig. 3d). Similarly, CD62Lhi rTreg phenotype cells in
CA/CA mice expressed higher levels of the activation-associated genes,
including Cd69, Egr2 and Il1r2, than wild-type rTreg cells (Extended
Data Fig. 3e). In addition, EdU labelling experiments showed that
CA/CA rTreg phenotype cells had a higher proliferation rate than
wild-type rTreg cells (Extended Data Fig. 3f).

To dissect the effect of CA on Treg cells further, we purified
CD62LhiCD44loCD69−ICOS− rTreg cells from wild-type, CA/+ and
CA/CA mice, and performed in vitro culture experiments. Compared
to wild-type rTreg cells, the CA-expressing rTreg cells showed no defect
in activation, proliferation or survival, but they failed to downregulate
CD62L (Extended Data Fig. 4). CCR7 downregulation was also
attenuated in CA-expressing Treg cells (Extended Data Fig. 3c), revealing
that relief of Akt-triggered Foxo1 inhibition is sufficient to maintain
high expression of molecules involved in lymph node homing and
intranodal T-cell migration. To determine whether such alteration
perturbed Treg cell trafficking, we intravenously administered a CD4
antibody, which predominantly labelled cells in the red pulp or
marginal zone of spleen, but not the white pulp. Fewer CA-expressing
Treg cells were found among the labelled red pulp/marginal zone
fraction (Fig. 3c), whereas comparable percentages of conventional
T cells were labelled (Extended Data Fig. 5a), suggesting that CA
Figure 3 | Expression of a constitutively active form of Foxo1 in Treg
cells alters cell migration. a, b, Flow cytometric analysis of CD44,
CD62L (a), and CD69, CD62L (b) in lymph node Treg cells (CD4+Foxp3+)
from wild-type (WT), Foxp3creFoxo1CA/+ (CA/+) or Foxp3creFoxo1CA/
Foxo1CA (CA/CA) mice. Quantification of rTreg (CD62LhiCD44lo)
and aTreg (CD62LloCD44hi) cell subsets (a), and CD69+CD62Llo,
CD69+CD62Lhi populations (b) are shown, with percentages of indicated
populations among total Treg cells. n = 6. c, In vivo labelling experiment:
anti-CD4-biotin (RM4-4) was injected intravenously 5 min before
analysis. Anti-CD4 (RM4-5) was used to label total CD4+ T cells.
Representative plots from gated Treg cells (CD4 RM4-5+Foxp3+) are
shown. Quantification shows percentage of RM4-4+ cells among Treg cells.
n = 10 (WT, CA/CA); n = 7 (CA/+). d, e, Foxp3 expression in lymph node
(d) or liver (e) CD4+ T cells. Quantification shows percentages of Treg
cells among total CD4+ T cells. n = 6 (WT, CA/CA); n = 4 (CA/+). All
mice were 9–12 days old. Data are mean ± s.e.m. Results represent at least
three independent experiments. *P < 0.05; **P < 0.01; ****P < 0.0001
(unpaired t-test).


prevented the migration of Treg cells from white pulp to red pulp/
marginal zone.

rTreg and aTreg cells reside at different locations in secondary
lymphoid organs, and engage discrete mechanisms of homeostatic
maintenance14. Whereas the total proportion or number of lymph node
CD62Lhi Treg cells was similar or increased in CA-expressing mice
compared to wild-type mice, CD62Llo Treg cells were reduced with
increasing doses of CA expression (Extended Data Fig. 5b, c),
resulting in reduced total lymph node Treg cells (Fig. 3d). Thus, although
Foxo1 inactivation is not essential for Treg cell activation,
proliferation or survival, it is required to control the expression of trafficking
molecules that may promote aTreg cells to migrate away from the
rTreg cell niche and further expand. Indeed, CA expression from one
or two alleles caused a dose-dependent reduction of liver Treg cells
(Fig. 3e).

To examine whether the aTreg cell defects were sustained beyond
the neonatal stage, we analysed 4–6-week-old mice. Expression of
CA triggered a dose-dependent reduction of aTreg cells in lymphoid
organs as well as in non-lymphoid tissues, leaving rTreg cells less affected
(Fig. 4a and Extended Data Fig. 6a, b). Compared to wild-type and
CA/+ mice, all CA/CA mice succumbed to a wasting disease and death
by 4 months of age (Fig. 4b and Extended Data Fig. 6c, d). In contrast to
Treg-cell-specific Foxo1-deficient mice12, CA/CA mice did not exhibit
the typical inflammatory phenotype associated with the 'Scurfy'
mutation of Foxp3 such as tail crusting, splenomegaly or lymphadenopathy
(Fig. 4c and Extended Data Fig. 6c, e). Expression of CA did not perturb
the levels of the Treg cell markers CTLA4, LAG3 and GITR, or Treg cell
suppressive activity in vitro (Extended Data Fig. 6f, g). Yet, a dense
infiltrate of leukocytes was observed in many organs of CA/CA mice,
including the liver and colon (Fig. 4d). Serum alanine aminotransferase
(ALT) activity, a biomarker for liver injury, as well as colon pathology
scores were increased (Fig. 4e), revealing that the immunopathology
had resulted in tissue damage.

The autoimmune lethal phenotype in CA/CA mice was associated
with activation of CD4+ and CD8+ T cells (Extended Data Fig. 7a, d),
yet they produced only modestly higher or comparable amounts of
inflammatory cytokines compared to T cells from wild-type or CA/+
mice (Extended Data Fig. 7b, c, e, f). Splenic CD8+ T cells, however,
expressed substantially higher levels of the cytolytic molecule
granzyme B (GzmB) (Fig. 4f, g). Much enhanced expression of GzmB, but
not IFN-γ, was also observed in liver- and lamina-propria-infiltrating
CD8+ T cells (Fig. 4f, g and Extended Data Fig. 8a, b). In addition,
T-cell populations were skewed towards CD8+ T cells in these tissues
(Extended Data Fig. 8c), suggesting that augmented effector CD8+
T-cell responses might trigger the immunopathology. To test this
hypothesis, we crossed CA/CA mice to the CD8-deficient background.
Depletion of CD8+ T cells completely rescued the wasting disease and
lethal phenotype of CA/CA mice (Fig. 4h and Extended Data Fig. 9a).
Serum ALT levels and tissue pathology were also fully rectified
(Fig. 4i and Extended Data Fig. 9b). Collectively, these findings suggest
that aTreg cells have a particularly important function in suppressing
CD8+ T-cell-mediated tissue destruction.

Spontaneous autoimmune pathology was observed in CA/CA but
not CA/+ mice, in line with CA dose-dependent repression of aTreg
cells (Fig. 4a and Extended Data Fig. 6a, b). CA expression also led
to varying degrees of Treg cell reduction in different tissues, with the
lamina propria more affected than the liver (Fig. 4a). In fact, the
magnitude of Treg cell loss in lymphoid organs and non-lymphoid tissues
was inversely correlated with Foxo1 activity, as revealed by expression
of two direct Foxo1 target genes (Fig. 5a).

High numbers of Treg cells are found in tumours6,8, which is
associated with poor prognosis of cancer patients22. The role of Treg cells in
tumour immunity has been investigated primarily with methods that
deplete Treg cells in all tissues23,24. To study how CA affects
tumour-infiltrating Treg cells, we used the MMTV-PyMT (PyMT) spontaneous
mammary tumour model25. Similar to Treg cells from the liver and
lamina propria, tumour-infiltrating Treg cells exhibited an activated
phenotype (Extended Data Fig. 10a), yet they expressed the lowest
level of Foxo1 targets (Fig. 5a), suggesting that they might be most
sensitive to Foxo1 gain-of-function. To test this hypothesis, we crossed
CA/+ mice to the PyMT background. Indeed, tumour-infiltrating Treg
cells were diminished by 2.6-fold with CA expression from one allele
(Fig. 5b, c), which resulted in profound inhibition of tumour growth
(Fig. 5d). Although CD8+ T cells were phenotypically
indistinguishable in healthy tissues from CA/+ and wild-type mice (Fig. 4f, g),
Figure 4 | Foxo1 hyperactivation in Treg
cells causes a CD8+ T-cell-dependent
inflammatory disease. a, The frequencies
of Treg cells among CD4+ T cells in lymph
node, spleen, liver and colon lamina propria
of wild-type, Foxp3creFoxo1CA/+ (CA/+) or
Foxp3creFoxo1CA/Foxo1CA (CA/CA) mice.
n = 6 (WT, CA/CA); n = 8 (CA/+). Numbers
staining of liver and colon sections. Original
magnification, ×20. e, Top, serum ALT activity;
bottom, histological grading of colons. n = 4.
f, g, Flow cytometric analysis of granzyme
Figure 5 | Tuned activation of Foxo1 in Treg cells results in enhanced
anti-tumour immunity without inflicting autoimmunity. a, Flow
cytometric analysis of Foxo1-activated target genes CCR7 and Bim in Treg
cells from lymph node, spleen, liver, colon lamina propria and tumour
of 22–24-week-old PyMT mammary tumour-bearing mice. Numbers
indicate percentage of CCR7- or Bim-positive cells shown in the gate.
b, Foxp3 expression in tumour-infiltrating CD4+ T cells from PyMT or
Foxp3creFoxo1CA/+PyMT (CA/+PyMT) mice. c, The frequencies of Treg
cells among CD4+ T cells in tumours of PyMT or CA/+PyMT mice. n = 8
(PyMT); n = 6 (CA/+PyMT). Number above plots indicates fold change in
comparison to PyMT. d, Tumour growth curve of PyMT and CA/+PyMT
mice. n = 5. e, f, Flow cytometric analysis of GzmB expression in CD8+ T
cells (e) and quantification (f) from tumours of PyMT and CA/+PyMT
mice. n = 6. g–l, Eight-to-ten-week-old wild-type or Foxp3creFoxo1CA/+
(CA/+) mice received orthotopic inoculation of PyMT-derived mammary
tumour cells (AT-3) (g–i), or subcutaneous injection of B16 melanoma
cells (j–l). g, j, Tumour growth curve of wild-type and CA/+ mice. n = 8.
h, k, The frequencies of Treg cells among CD4+ T cells in tumours of
wild-type and CA/+ mice. In h, n = 4 (WT); n = 5 (CA/+). In k, n = 6.
Numbers above plots indicate fold change in comparison to wild type.
i, l, The frequencies of GzmB-expressing CD8+ T cells from tumours
of wild-type and CA/+ mice. In i, n = 4 (WT); n = 5 (CA/+). In l,
n = 5 (WT); n = 6 (CA/+). Results represent at least three independent
experiments. Data are mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001;
****P < 0.0001 (unpaired t-test (c, f, h, i, k, l) and two-way analysis of
variance (ANOVA) (d, g, j)).
tumour-infiltrating CD8⁺ T cells from CA/+PyMT mice expressed
higher amounts of GzmB (Fig. 5e, f).

Tumour progression in PyMT mice is accompanied by loss of genome stability,
which induces accumulation of mutations to foster cell transformation. To
determine whether CA expression in Treg cells could inhibit growth of already
transformed cells, we used syngeneic AT-3 cells derived from PyMT mice.
Orthotopic implantation of AT-3 to the mammary fat pad resulted in aggressive
tumour growth and lethality in wild-type mice, both of which were attenuated
in CA/+ mice (Fig. 5g and Extended Data Fig. 10b). CA expression led to a
6.5-fold reduction of tumour-infiltrating Treg cells (Fig. 5h and Extended
Data Fig. 10c), which was associated with increased expression of GzmB in
tumour-infiltrating CD8⁺ T cells (Fig. 5i and Extended Data Fig. 10d).
Importantly, depletion of CD8⁺ T cells by crossing CA/+ mice to the
CD8-deficient background restored tumour growth (Extended Data Fig. 10e),
revealing an essential function for CD8⁺ T cells in tumour suppression. To
investigate whether the CA effect was applicable to tumours of other tissue
origin, we used B16 melanoma cells. Similar to AT-3, B16 growth was inhibited
in CA/+ mice (Fig. 5j), which was accompanied by an approximately tenfold
reduction of tumour-infiltrating Treg cells and increased GzmB-expressing
CD8⁺ T cells (Fig. 5k, l). These findings reveal that tumour-associated
Treg cells are generally more susceptible to CA-triggered depletion.

rTreg and aTreg cell subsets are well defined in humans and mice, but their
individual contributions to immune tolerance have been enigmatic. Our data
reveal that Foxo1 repression is a major transcriptional reprogramming event
associated with differentiation of aTreg cells, which supports their distinct
migration pattern resembling that of effector memory T cells. Using a Foxo1
gain-of-function model that preferentially triggers aTreg cell loss in
peripheral tissues, we could identify a prominent role for these cells in the
control of CD8⁺ T-cell tolerance. Although CD4⁺ T cells are insufficient to
induce immunopathology, aTreg cells might inhibit their helper function to
suppress cytotoxic T-cell responses. In addition, aTreg cells may inhibit
CD8⁺ effector T cells by modulating antigen-presenting cells as shown in a
tumour model. Importantly, tumour-associated aTreg cells are more
susceptible to Foxo1-triggered depletion as a likely consequence of
higher antigen load, and effective tumour immunity can be induced
without overt spontaneous autoimmunity. aTreg cell differentiation is
dependent on T-cell receptor signalling that inactivates Foxo1 via
Akt. Drugs that target upstream TCR signalling molecules, as recently
shown for phosphatidylinositol-3-OH kinase (PI(3)K), may impinge
on the Foxo1 pathway to break Treg-cell-mediated tumour immune
tolerance selectively.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Mice. The Foxo1CA (Foxo1AAA knock-in) and Foxo1tag-GFP (Foxo1tag) mouse
models, as well as MMTV-PyMT on the C57BL/6 background, were previously
described. C57BL/6, CD45.1⁺ and CD8⁻/⁻ mice were purchased from Jackson
Laboratory. Mice with Treg-cell-specific expression of Foxo1CA were generated
by crossing Foxo1AAA with Foxp3Cre mice. Foxp3Cre mice express yellow
fluorescent protein (YFP) from the Foxp3 locus, and were used as a reporter
for sorting experiments. To mark Treg cells with red fluorescent protein
(RFP) in the Foxo1tag-GFP experiment, Foxo1tag mice were bred with
Foxp3-IRES-RFP mice. In all experiments, littermate controls were used when
possible. Unless mentioned otherwise, both male and female mice were
included. All mice were maintained under
specific pathogen-free conditions, and animal experimentation was conducted
in accordance with procedures approved by the Institutional Animal Care and
Use Committee of Memorial Sloan Kettering Cancer Center. Investigators were
not blinded to group allocation and outcome assessment. No statistical methods
were used to predetermine sample size and the experiments were not randomized.
Parabiosis. Parabiosis and separation were done as reported with 6-8-week-old
congenically marked female C57BL/6 mice that were matched for body weight. In
brief, matching skin incisions were made from the elbow to the knee of each mouse.
Forelimb and hindlimb connections were made with sutures and skin incisions
were closed using wound clips. For separation, connections between parabionts
were disrupted and skin incisions were closed using wound clips. Parabionts were
maintained for 4 weeks before surgical separation. Separated parabiont mice were 
analysed up to 18 weeks after surgery. 

Tumour models. The MMTV-PyMT spontaneous tumour model was previously
described. AT-3 model: 8-10-week-old wild-type, Foxp3CreFoxo1CA/+ (CA/+),
CD8⁻/⁻, or Foxp3CreFoxo1CA/+CD8⁻/⁻ (CA/+CD8⁻/⁻) mice were injected with
AT-3 mammary tumour cells (2 × 10⁵ cells into the mammary fat pad). B16 model:
8-10-week-old wild-type or CA/+ mice were injected with B16.F10 melanoma
cells (1.25 × 10⁵ cells subcutaneously). For all tumour models, tumours were
measured regularly with a caliper. Tumour volume was calculated using the
equation (L × W²) × 0.52, in which L denotes length, and W denotes width. The
maximal tumour burden was 3,000 mm³, and in none of the experiments was the
limit exceeded. For PyMT, individual tumour volumes were added together to
calculate total tumour burden. Tumour-bearing mice were euthanized at 22-24
weeks old for the PyMT model, 30-35 days after inoculation for the AT-3 model
and 20-24 days after inoculation for the B16 model. In all flow cytometry
experiments, tumours of similar sizes were used for comparison.
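
The volume equation and the summing of individual tumours for PyMT mice can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' analysis code: the function names and the example caliper readings are hypothetical, and lengths are assumed to be in millimetres so that volumes come out in mm³.

```python
MAX_TUMOUR_BURDEN_MM3 = 3000.0  # maximal tumour burden stated in the Methods


def tumour_volume(length_mm: float, width_mm: float) -> float:
    """Volume estimate from the Methods: (L x W^2) x 0.52."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("caliper measurements must be non-negative")
    return length_mm * width_mm ** 2 * 0.52


def total_tumour_burden(measurements) -> float:
    """Sum individual tumour volumes, as described for the PyMT model."""
    return sum(tumour_volume(length, width) for length, width in measurements)


# Hypothetical (length, width) caliper readings in mm for one mouse.
example = [(10.0, 8.0), (6.0, 5.0)]
burden = total_tumour_burden(example)
print(f"total burden: {burden:.1f} mm^3")
print("within limit:", burden <= MAX_TUMOUR_BURDEN_MM3)
```

For a single-tumour model such as AT-3 or B16, `tumour_volume` alone would be used for each measurement day.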

Cell isolation. After whole-body perfusion with 50 ml of heparinized PBS,
lymphocytes were isolated as follows. Single-cell suspensions were prepared
from spleens and peripheral (axillary, brachial and inguinal) lymph nodes by
tissue disruption with glass slides. To isolate cells from the liver, tissues
were finely minced and digested with 1 mg ml⁻¹ collagenase D (Worthington) for
30 min at 37 °C. For lamina propria lymphocyte isolation, the colon was
dissected and washed in HBSS. Intestinal pieces were stirred in 1 mM
dithiothreitol (DTT) in HBSS to release intraepithelial lymphocytes. The
remaining intestinal tissues were finely minced and digested with RPMI plus
5% FBS and 1 mg ml⁻¹ collagenase D for 30 min at 37 °C. For
tumour-infiltrating immune cell isolation, tumour tissues were prepared by
mechanical disruption followed by 1 h treatment with 280 U ml⁻¹ collagenase
type 3 (Worthington) and 4 μg ml⁻¹ DNase I (Sigma) at 37 °C. After the
digestion steps, cells isolated from the liver, lamina propria and tumour were
filtered through a 70-μm cell strainer, layered in a 44% and 66% Percoll
gradient (Sigma), and centrifuged at 1,600g for 30 min without brake. Cells at
the interface were collected and analysed by flow cytometry.

Flow cytometry. Fluorochrome-conjugated, biotinylated antibodies against
CD45.1 (clone 104), CD45.2 (A20), TCR-β (H57-595), CD4 (RM4-5), CD8
(17A2), CD44 (IM7), CD62L (MEL-14), CD69 (H1.2F3), CCR7 (4B12), Foxp3
(FJK-16s), ICOS (C398.4A), IFN-γ (XMG1.2), IL-4 (11B11) and NK1.1 (PK136)
were purchased from eBioscience. Antibodies against CD49a (Ha31/8) and IL-17a
(TC11-18H10.1) were purchased from BD Biosciences. Anti-GzmB (GB11)
was purchased from Invitrogen. Purified antibodies against Bim (C34C5),
p-Akt(S473) (736E11), p-Akt(T308) (C31E5E), p-Foxo1(T24) and p-S6(S240)
were purchased from Cell Signaling. All antibodies were tested with their
respective isotype controls. Cell surface staining was performed by incubating
cells with specific antibodies for 30 min on ice in the presence of 2.4G2
monoclonal antibody to block FcR binding. CCR7 staining was performed at
37 °C for 30 min before cell surface staining. Foxp3, Bim, GzmB, IFN-γ, IL-4
and IL-17a staining was carried out using the intracellular transcription
factor or cytokine staining kits from Tonbo or BD Biosciences. Phosphorylation
staining was performed using the BD phospho-protein kit.
Fluorochrome-conjugated secondary antibodies were used for the staining of
purified antibodies. To determine cytokine expression, isolated cells were
stimulated with 50 ng ml⁻¹ phorbol 12-myristate 13-acetate (Sigma), 1 mM
ionomycin (Sigma) and GolgiStop (BD Biosciences) for 4 h before staining.
Incorporation of EdU was measured using the Click-iT EdU flow cytometry assay
kit according to the manufacturer's instructions (Invitrogen). Mice were
injected intraperitoneally with 50 μg per g body weight of EdU and euthanized
18 h later. For all stains, dead cells were excluded from analysis by means of
Live/Dead Fixable Dye (Invitrogen), DAPI or propidium iodide stain. All
samples were acquired and analysed with an LSRII flow cytometer (Becton
Dickinson) and FlowJo software (TreeStar).
In vivo T-cell labelling. Anti-CD4-biotin (RM4-4, BioLegend) (2 μg) was injected
intravenously, and mice were euthanized 5 min after injection. Splenocytes were
prepared for flow cytometric analysis as described above. Streptavidin conjugated
with fluorophore was used as the secondary labelling for anti-CD4-biotin, and a
non-competing clone of CD4 antibody (RM4-5, eBioscience) was used to stain
total CD4⁺ cells.
In vitro T-cell culture. Akt inhibitor experiment: CD4⁺Foxp3⁺CD62LhiCD44lo
rTreg cells were purified from spleen and lymph nodes of Foxp3-YFP reporter
mice by flow cytometry sorting (BD FACS Aria) at the Flow Cytometry Core
Facility of Memorial Sloan Kettering Cancer Center (MSKCC). rTreg cells were
cultured with plate-bound anti-CD3 (coated overnight, 5 μg ml⁻¹), soluble
anti-CD28 (2 μg ml⁻¹) and IL-2 (200 U ml⁻¹) for 3 days. Akt inhibitor
(MK-2206 2HCl, Selleckchem, 2 μM final concentration) or DMSO solvent control
was added to the culture. rTreg cell activation experiments:
CD4⁺Foxp3⁺CD62LhiCD44loCD69⁻ICOS⁻ rTreg cells were sorted from Foxp3Cre,
Foxp3CreFoxo1CA/+ or Foxp3CreFoxo1CA/Foxo1CA mice, and labelled with
CellTrace Violet (Invitrogen). Cells were cultured with plate-bound anti-CD3
(coated overnight, 5 μg ml⁻¹), soluble anti-CD28 (2 μg ml⁻¹) and IL-2
(200 U ml⁻¹) for 3 days.
In vitro Treg cell suppression. CD4⁺CD25⁻CD62LhiCD44lo conventional T cells
purified by flow cytometry sorting were labelled with CellTrace Violet and used
as responder cells. Responder T cells (5 × 10⁴) were cultured for 72 h with
irradiated splenocytes (1 × 10⁵) and anti-CD3 (2 μg ml⁻¹) in the presence or
absence of various numbers of Treg cells. Treg cells (CD4⁺Foxp3-YFP⁺) were
isolated by FACS sorting from Foxp3Cre, Foxp3CreFoxo1CA/+ or
Foxp3CreFoxo1CA/Foxo1CA mice.
Gene-expression profiling. Splenic rTreg (CD4⁺Foxp3-YFP⁺CD62LhiCD44lo)
and aTreg (CD4⁺Foxp3-YFP⁺CD62LloCD44hi) cells were isolated from
Foxp3-YFP reporter mice by FACS sorting. RNA was prepared with the miRNeasy
Mini Kit according to the manufacturer's instructions (Qiagen). Complementary
DNA (cDNA) libraries were amplified using the SMARTer RACE Amplification Kit
(Clontech), and were sequenced in replicate using 50-base-pair paired-end reads
at the Genomics Core Laboratory of MSKCC. Ribosomal RNA reads were quantified
and filtered using the short-read aligner Bowtie v2.1.0 (ref. 34). The remaining
reads were aligned to the mouse genome (mm10) using the STAR v2.3.1 short-read
aligner. Additional quality control was performed using RSeQC v2.3.7 (ref. 37).
Gene abundance was quantified by featureCounts via the Subread analysis suite
v1.4.3 (ref. 39). Differential gene expression was estimated using the DESeq2 R
package with gene annotations curated in GENCODE version 2 for mouse reference
genome GRCm38 (Ensembl 74).
Quantitative PCR. RNA extraction was performed using RNeasy columns
(Qiagen) and cDNA was generated using the QuantiTect Reverse Transcription
Kit (Qiagen) according to the manufacturer's instructions. The RT² SYBR Green
kit (Qiagen) and Stratagene Mx3500 (Agilent) were used in the qPCR experiments.
mRNA levels of Lamc1, Nid2, Mmp9, Cd69, Egr2, Il1r2 and Actb were determined
by the primers listed below. The mRNA amounts were normalized to those of Actb.
Lamc1: 5'-GGTGGTCTGTTTCAGCCATT-3' and 5'-TGCCACAAAATCTCAGCTTG-3';
Nid2: 5'-TGGATATGGCCAAGGAGAAG-3' and 5'-CACCGAGGACAGTTTCCATT-3';
Mmp9: 5'-ACCTTCCAGTAGGGGCAACT-3' and 5'-TGAATCAGCTGGCTTTTGTG-3';
Cd69: 5'-TCCGTGGACCACTTGAGAGT-3' and 5'-ATACTGGTGCCATGGTCCTT-3';
Egr2: 5'-AGGCCGTAGACAAAATCCCA-3' and 5'-TGATCATGCCATCTCCCGC-3';
Il1r2: 5'-TGGTGCACACAGGAAAGGTT-3' and 5'-TGGAGATGTCGGAGTGAGGT-3';
Actb: 5'-TTGCTGACAGGATGCAGAAG-3' and 5'-ACATCTGCTGGAAGGTGGAC-3'.
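
The Methods state only that mRNA amounts were normalized to Actb, without giving the calculation. One common way to do this is the 2^-ΔCt approach, sketched below in Python. The choice of that formula, the assumption of roughly 100% amplification efficiency, the function name and the Ct values are all assumptions made for illustration, not details taken from the paper.

```python
def relative_expression(ct_target: float, ct_actb: float) -> float:
    """Relative mRNA amount normalized to Actb.

    Assumes the 2^-(Ct_target - Ct_Actb) convention, i.e. one PCR cycle
    corresponds to a twofold difference in template.
    """
    delta_ct = ct_target - ct_actb
    return 2.0 ** -delta_ct


# Hypothetical threshold-cycle (Ct) values from one sample.
ct_values = {"Cd69": 24.0, "Egr2": 26.0, "Actb": 18.0}

for gene in ("Cd69", "Egr2"):
    rel = relative_expression(ct_values[gene], ct_values["Actb"])
    print(f"{gene}: {rel:.6f} relative to Actb")
```

Comparing two cell populations would then amount to taking the ratio of their normalized values for the same gene.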
Immunofluorescence microscopy. Foxo1 localization experiment: spleens
from wild-type mice were immediately disrupted using glass slides into Cytofix/
Cytoperm buffer (BD). After incubation for 30 min at room temperature, the cells
were washed, resuspended in 90% methanol and incubated on ice for 30 min.
After an additional wash, cells were stained for surface and intracellular antigens,
including CD4, CD44 and Foxp3, for 45 min at room temperature in the dark.
rTreg (CD4⁺Foxp3⁺CD44lo) and aTreg (CD4⁺Foxp3⁺CD44hi) cells were then
purified by FACS sorting, and spun onto glass slides using a Cytospin
centrifuge (Thermo Scientific) at 1,200 r.p.m. for 5 min. Cells on the glass
slides were stained with 1:150 diluted anti-Foxo1 (C29H4, Cell Signaling),
followed by fluorophore-conjugated secondary antibody staining. Slides were
mounted with gold anti-fading mounting buffer (Invitrogen). Images were
acquired with a Leica TCS SP5-II confocal microscope at the Molecular Cytology
Core of MSKCC. For quantitative analysis, five fields were selected randomly
and total cells in the field were manually counted and grouped with Volocity
software (PerkinElmer Inc.), on the basis of their Foxo1 nuclear or cytosolic
localization.
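
As a rough illustration of the quantification step, the pooled fraction of cells with nuclear Foxo1 across the counted fields could be computed as below. The counts are made-up numbers, the function name is hypothetical, and pooling all fields before dividing is an assumption; the paper does not say whether fields were pooled or averaged.

```python
def nuclear_fraction(fields):
    """Pooled fraction of cells scored as nuclear.

    `fields` is a list of (nuclear_count, cytosolic_count) pairs,
    one pair per randomly selected microscope field.
    """
    nuclear = sum(n for n, _ in fields)
    total = sum(n + c for n, c in fields)
    if total == 0:
        raise ValueError("no cells were counted")
    return nuclear / total


# Hypothetical manual counts from five fields of sorted rTreg cells.
rtreg_fields = [(18, 2), (15, 5), (20, 0), (17, 3), (16, 4)]
print(f"nuclear Foxo1: {nuclear_fraction(rtreg_fields):.0%}")
```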

For HA-Foxo1 staining, CD4⁺Foxp3-YFP⁺ cells were purified from Foxp3Cre,
Foxp3CreFoxo1CA/+ or Foxp3CreFoxo1CA/Foxo1CA mice by FACS sorting, and
were spun onto glass slides using a Cytospin centrifuge (Thermo Scientific) at
500 r.p.m. for 1 min. After fixation with 4% paraformaldehyde, cells were
permeabilized with Foxp3 fixation/permeabilization buffer (eBioscience)
according to the manufacturer's instructions. Cells were then blocked with
permeabilization buffer and 3% BSA, and incubated with 1:500 diluted anti-HA
(C29F4, Cell Signaling) and 1:150 diluted anti-Foxp3 (FJK-16s, eBioscience),
followed by fluorophore-conjugated secondary antibodies in permeabilization
buffer and 1% BSA.
Immunoblotting. Treg cells (CD4⁺Foxp3-YFP⁺) were isolated from Foxp3Cre,
Foxp3CreFoxo1CA/+ or Foxp3CreFoxo1CA/Foxo1CA mice by FACS sorting. Total
protein extracts were dissolved in SDS sample buffer, separated on 12%
SDS-PAGE gels and transferred to polyvinylidene difluoride membrane (Millipore).
The membranes were probed with anti-HA (6E2, Cell Signaling) and anti-β-actin
(AC-15, Sigma), and visualized with the Immobilon Western Chemiluminescent
HRP Substrate (Millipore).
Histopathology. Liver and colon tissues were fixed in Safefix II (Protocol) and 
embedded in paraffin. Five-millimetre sections were stained with haematoxylin 
and eosin. The following grades were used to evaluate the colon pathology: 0, nor- 
mal colonic crypt architecture; 1, mild inflammation: slight epithelial cell hyper- 
plasia and increased numbers of leukocytes in the mucosa; 2, moderate colitis: 
pronounced epithelial hyperplasia, marked leukocyte infiltration, and decreased 
numbers of goblet cells; 3, severe colitis: marked epithelial hyperplasia with
extensive leukocyte infiltration, substantial depletion of goblet cells,
occasional ulceration, or crypt abscesses; 4, very severe colitis: marked
epithelial hyperplasia with
extensive transmural leukocyte infiltration, severe depletion of goblet cells, many 
crypt abscesses and severe ulceration. 


Serum ALT. Blood was collected immediately after mice were euthanized and was
stored at room temperature for 1 h. The samples were then centrifuged for 15 min
at 1,800g, and the supernatant was obtained as serum. ALT activity was determined
according to the manufacturer's instructions (Sigma-Aldrich), using a SpectraMax
M5 plate reader (Molecular Devices).

Statistical analysis. All data are presented as the mean ± s.e.m. Comparisons
between groups were analysed using unpaired or paired Student's t-tests or
ANOVA, as appropriate. *P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001.
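
For readers who want to see the arithmetic behind "mean ± s.e.m." and an unpaired Student's t-test, a minimal standard-library Python sketch follows. It is not the authors' analysis pipeline: the sample values are invented, the function names are hypothetical, and the pooled-variance (equal variances) form of the t statistic is an assumption, because the paper does not state which variant was used.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def unpaired_t(a, b):
    """Student's t statistic and degrees of freedom for two independent
    groups, using the pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


# Hypothetical Treg frequencies (% of CD4+ T cells) in two groups.
wild_type = [10.0, 12.0, 11.0, 13.0]
ca_het = [4.0, 5.0, 6.0, 5.0]

for name, group in (("WT", wild_type), ("CA/+", ca_het)):
    m, sem = mean_sem(group)
    print(f"{name}: {m:.2f} ± {sem:.2f}")

t, df = unpaired_t(wild_type, ca_het)
print(f"t = {t:.2f}, d.f. = {df}")
```

Turning the t statistic into a P value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy would normally be used for that step.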
Extended Data Figure 1 | Parabiotic analysis of Treg cells from different
organs. a, Flow cytometric analysis of CD44 and CD62L in Treg cells
(CD4⁺Foxp3⁺) from lymph node, spleen, liver and colon lamina propria
of C57BL/6 mice. Percentages of rTreg and aTreg cells are shown. A sizable
population of intermediate phenotype (CD62LhiCD44hi) Treg cells was
also present in the liver. b, Graphical representation of the parabiosis
experiments. Congenically mismatched C57BL/6 mice were surgically
connected (time point -4). Parabionts were analysed 2 or 4 weeks after
surgery (time point -2 or 0). In separation experiments, parabiotic mice
that had been connected for 4 weeks were surgically disconnected from
each other (time point 0). Separated mice were analysed 2, 6, 12 or
18 weeks after surgery (time point 2, 6, 12 or 18). c, Representative flow
cytometric plots showing chimaerism of naive CD4⁺ T cells or different
Treg cell subsets in a CD45.2/2⁺ parabiont at various time points.
Extended Data Figure 2 | aTreg cell differentiation is associated with
Akt-triggered suppression of Foxo1. a, Gene expression comparison
of Foxo1-regulated genes in splenic aTreg versus rTreg cells.
Foxo1-regulated genes were defined by the following criteria: (1)
differentially expressed between wild-type and Foxo1-knockout Treg cells;
(2) the expression was corrected by expression of a constitutively active
mutant of Foxo1 (Foxo1CA). b, Normalized expression of all transcripts
isolated from splenic aTreg cells was plotted against transcripts from
splenic rTreg cells. Some of the Foxo1 direct target genes are highlighted.
c, Comparison of Lamc1, Nid2 and Mmp9 mRNA levels in rTreg versus aTreg
subsets. d, Flow cytometric analysis of phosphorylated S6 ribosomal
protein in splenic aTreg and rTreg cells. Quantification shows mean
fluorescence intensity normalized to rTreg cells. n = 3. *P < 0.05 (paired
t-test). e, aTreg cell differentiation experiments in vitro. rTreg cells
(CD4⁺Foxp3⁺CD62LhiCD44lo) from spleen and lymph nodes of Foxp3-YFP
reporter mice were purified by flow cytometric sorting, and were
stimulated with anti-CD3/CD28 and IL-2 for 3 days. Akt inhibitor
(Akti, MK-2206) or DMSO solvent control was added to the culture.
Phosphorylated Foxo1, and CD44 and CD62L levels were determined by
flow cytometry. Quantification shows percentages of rTreg and aTreg cells in
DMSO control or Akt inhibitor group. n = 3. *P < 0.05 (unpaired t-test).
Data are mean ± s.e.m.
Extended Data Figure 3 | Foxo1 hyperactivation in Treg cells does
not affect expression of activation markers or proliferation.
a, b, Treg cells (CD4⁺Foxp3⁺) from spleens and lymph nodes of wild-type,
Foxp3CreFoxo1CA/+ (CA/+) or Foxp3CreFoxo1CA/Foxo1CA (CA/CA)
mice were purified by flow cytometric sorting. a, Haemagglutinin
(HA)-tagged Foxo1CA protein in the whole-cell lysate was measured by
immunoblotting with anti-HA. b, Subcellular localization of HA-Foxo1CA
was determined by immunofluorescence staining with anti-HA.
c, Flow cytometric analysis of Foxo1-activated target genes CCR7 and Bim
in lymph node Treg cells from 9-12-day-old wild-type, CA/+ and CA/CA
mice. Grey shaded lines represent isotype controls. d, Flow cytometric
analysis of ICOS and CD62L expression in lymph node Treg cells from
9-12-day-old wild-type, CA/+ and CA/CA mice. Bar graph shows
fractions of ICOS⁺CD62Lhi, ICOS⁺CD62Llo subsets among Treg cells.
n = 4. e, CD62LhiCD44lo rTreg cells from wild-type or CA/CA mice, or
CD62LloCD44hi aTreg cells from wild-type mice were isolated by flow
cytometric sorting, and the expression of Cd69, Egr2 and Il1r2 mRNA
was determined by reverse transcriptase quantitative PCR (RT-qPCR).
f, EdU was administered intraperitoneally into 9-10-day-old wild-type or
CA/CA mice. EdU incorporation in lymph node rTreg and aTreg cells was
analysed 18 h after injection. n = 3. **P < 0.01 (unpaired t-test). Data are
mean ± s.e.m.
Extended Data Figure 4 | CA-expressing Treg cells show intact activation,
proliferation and survival in vitro. CD4⁺Foxp3⁺CD62LhiCD44lo
CD69⁻ICOS⁻ rTreg cells were isolated from wild-type, Foxp3CreFoxo1CA/+
(CA/+) or Foxp3CreFoxo1CA/Foxo1CA (CA/CA) mice, and were
stimulated with anti-CD3, anti-CD28 and IL-2 for 3 days. a, CD44,
CD62L, CD69 and ICOS expression was determined by flow cytometry.
Grey shaded lines represent isotype controls. b, Cells were stained with
CellTrace Violet dye before culture, and cell division was tracked by
dilution of the dye. Grey shaded line represents undivided cells. c, Cell
death was measured by propidium iodide (PI) incorporation. Percentages
of live cell fraction (PI⁻) were compared. n = 4. NS, not significant
(unpaired t-test). Data are mean ± s.e.m.
Extended Data Figure 5 | Foxo1 hyperactivation depletes aTreg cells
in mice. a, In vivo labelling experiment: anti-CD4-biotin (RM4-4 clone)
was injected intravenously into wild-type, Foxp3CreFoxo1CA/+ (CA/+)
or Foxp3CreFoxo1CA/Foxo1CA (CA/CA) mice, and spleens were collected
5 min after injection. A non-competing clone of anti-CD4 (RM4-5) was
used in subsequent flow cytometric analysis to label total CD4⁺ T cells.
Representative flow cytometric plots from gated conventional CD4⁺
T cells (CD4 RM4-5⁺Foxp3⁻) are shown. Quantification shows percentage
of RM4-4⁺ among conventional CD4⁺ T cells. n = 10 (WT, CA/CA); n = 7
(CA/+). b, The frequencies of lymph node rTreg or aTreg cells among
CD4⁺ T cells of 9-12-day-old wild-type, CA/+ or CA/CA mice. n = 4-6.
c, The numbers of lymph node rTreg or aTreg cells of 9-12-day-old
wild-type, CA/+ or CA/CA mice. n = 6 (WT, CA/CA); n = 4 (CA/+). Unpaired
t-test. Data are mean ± s.e.m.
Extended Data Figure 6 | Foxo1 hyperactivation preferentially impairs
aTreg cells, but does not affect Treg cell suppressive function in vitro.
a, The frequencies of lymph node and splenic rTreg or aTreg cells among
CD4⁺ T cells of 4-6-week-old wild-type, Foxp3CreFoxo1CA/+ (CA/+) or
Foxp3CreFoxo1CA/Foxo1CA (CA/CA) mice. n = 10. b, The numbers of
lymph node rTreg or aTreg cells of 4-6-week-old wild-type, CA/+ or
CA/CA mice. n = 10. c, A representative picture of 5-week-old wild-type,
CA/+ or CA/CA mice. d, Body weight of 4-6-week-old wild-type, CA/+
and CA/CA mice. n = 5. e, The numbers of CD4⁺ and CD8⁺ T cells from
lymph node and spleen of 4-6-week-old wild-type, CA/+ and CA/CA
mice. n = 7 (WT, CA/+); n = 14 (CA/CA). f, Flow cytometric analysis of
CTLA4, LAG3 and GITR expression in lymph node rTreg and aTreg cells
from 4-6-week-old wild-type, CA/+ and CA/CA mice. Grey shaded lines
represent isotype controls. g, Suppression of wild-type naive CD4⁺ T cells,
labelled with CellTrace Violet (Tresp, responding T cells), by wild-type,
CA/+ or CA/CA Treg cells. Tresp cell division was assessed by CellTrace
Violet dilution at the indicated ratios of cell numbers between Treg and
Tresp cells. Grey line represents conditions without Treg cells in culture.
*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001 (unpaired t-test).
Data are mean ± s.e.m.
Extended Data Figure 7 | Modest increase of inflammatory cytokine
production by CD4⁺ and CD8⁺ T cells in Foxp3CreFoxo1CA/Foxo1CA
mice. a, d, Flow cytometric analysis of CD44 and CD62L expression in
CD4⁺Foxp3⁻ conventional T cells (a) or CD8⁺ T cells (d) from lymph
node and spleen of 4-6-week-old wild-type, Foxp3CreFoxo1CA/+ (CA/+)
or Foxp3CreFoxo1CA/Foxo1CA (CA/CA) mice. b, c, Representative
histogram (b), and quantification (c) of cytokine (IFN-γ, IL-4, IL-17)
expression by CD4⁺Foxp3⁻ conventional T cells from lymph node and
spleen of 4-6-week-old wild-type, CA/+ and CA/CA mice. n = 14 (WT,
CA/CA for IFN-γ), n = 7 (WT, CA/CA for IL-4, IL-17); n = 3 (CA/+).
e, f, Representative histogram (e), and quantification (f) of IFN-γ
expression by CD8⁺ T cells from lymph node and spleen of 4-6-week-old
wild-type, CA/+ and CA/CA mice. n = 9 (WT); n = 6 (CA/+); n = 11
(CA/CA). *P < 0.05; **P < 0.01 (unpaired t-test). Data are mean ± s.e.m.
Extended Data Figure 8 | Characterization of CD8⁺ T cells from
non-lymphoid tissues. a, b, Flow cytometric analysis (a), and
quantification (b) of IFN-γ expression by CD8⁺ T cells from liver and
colon lamina propria of 4-6-week-old wild-type, Foxp3CreFoxo1CA/+ …
(CA/+); n = 8 (CA/CA). c, CD8⁺ to CD4⁺ T-cell ratio among lymph node,
spleen, liver and lamina propria T cells of 4-6-week-old wild-type, CA/+
and CA/CA mice. n = 10 (WT); n = 8 (CA/+); n = 13 (CA/CA). *P < 0.05
(unpaired t-test). Data are mean ± s.e.m.
Extended Data Figure 9 | CD8⁺ T-cell depletion rescues the lethal
disease in Foxp3CreFoxo1CA/Foxo1CA mice. a, A representative image
of 5-week-old wild-type, CD8⁻/⁻, Foxp3CreFoxo1CA/Foxo1CA and
Foxp3CreFoxo1CA/Foxo1CACD8⁻/⁻ mice. b, Haematoxylin and eosin
staining of liver and colon sections from 5-week-old wild-type, CD8⁻/⁻,
Foxp3CreFoxo1CA/Foxo1CA and Foxp3CreFoxo1CA/Foxo1CACD8⁻/⁻ mice.
Extended Data Figure 10 | Foxp3CreFoxo1CA/+ mice show enhanced
anti-tumour immune responses. a, Flow cytometric analysis of CD44
and CD62L in Treg cells (CD4⁺Foxp3⁺) from lymph node and tumour
of PyMT tumour-bearing mice. b-d, Eight-to-ten-week-old wild-type
or Foxp3CreFoxo1CA/+ (CA/+) mice received orthotopic inoculation of
PyMT-derived mammary tumour cells (AT-3). b, Survival of wild-type
and CA/+ mice that received tumour implantation. c, Foxp3 expression in
tumour-infiltrating CD4⁺ T cells from day 30-35 AT-3 tumour. d, Flow
cytometric analysis of GzmB expression in CD8⁺ T cells from wild-type
or CA/+ tumour-bearing mice. e, Tumour growth curve of wild-type,
CD8⁻/⁻, CA/+, or Foxp3CreFoxo1CA/+CD8⁻/⁻ (CA/+CD8⁻/⁻) mice that
received orthotopic inoculation of AT-3 tumour cells. n = 4. *P < 0.05;
**P < 0.01; ****P < 0.0001; log-rank test (b) and two-way analysis of
variance (ANOVA) (e). Data are mean ± s.e.m.
A mechanism of viral immune evasion revealed by
cryo-EM analysis of the TAP transporter


Michael L. Oldham, Richard K. Hite, Alanna M. Steffen, Ermelinda Damko, Zongli Li, Thomas Walz & Jue Chen


Cellular immunity against viral infection and tumour cells depends 
on antigen presentation by major histocompatibility complex class I 
(MHC I) molecules. Intracellular antigenic peptides are transported 
into the endoplasmic reticulum by the transporter associated with 
antigen processing (TAP) and then loaded onto the nascent MHC I 
molecules, which are exported to the cell surface and present peptides 
to the immune system. Cytotoxic T lymphocytes recognize non-self
peptides and program the infected or malignant cells for apoptosis. 
Defects in TAP account for immunodeficiency and tumour 
development. To escape immune surveillance, some viruses have 
evolved strategies either to downregulate TAP expression or directly 
inhibit TAP activity. So far, neither the architecture of TAP nor the 
mechanism of viral inhibition has been elucidated at the structural 
level. Here we describe the cryo-electron microscopy structure of 
human TAP in complex with its inhibitor ICP47, a small protein 
produced by the herpes simplex virus I. Here we show that the 12 
transmembrane helices and 2 cytosolic nucleotide-binding domains 
of the transporter adopt an inward-facing conformation with the 
two nucleotide-binding domains separated. The viral inhibitor 
ICP47 forms a long helical hairpin, which plugs the translocation 
pathway of TAP from the cytoplasmic side. Association of ICP47 
precludes substrate binding and prevents nucleotide-binding 
domain closure necessary for ATP hydrolysis. This work illustrates 
a striking example of immune evasion by persistent viruses. By 
blocking viral antigens from entering the endoplasmic reticulum, 
herpes simplex virus is hidden from cytotoxic T lymphocytes, which 
may contribute to establishing a lifelong infection in the host. 


Inside our body, every nucleated cell has surface ‘barcodes’ that are 
surveyed by the immune system. These barcodes are peptides derived 
from intracellular proteins, presented on the surface by MHC I mole- 
cules to indicate whether the cell is healthy (reviewed in ref. 1). Peptides 
generated from normal cellular proteins are ignored by cytotoxic T cells, 
whereas viral-derived or malignant peptides will trigger an adaptive 
immune response, resulting in elimination of the infected or tumour 
cells. The peptide repertoire is generated in the cytoplasm, mainly 
by the proteasome, but also in part by cytosolic peptidases (Fig. 1a). 
Peptide uploading onto MHC I molecules takes place inside the endo- 
plasmic reticulum (ER) and is orchestrated by a macromolecular 
assembly collectively called the MHC class I peptide-loading complex 
(PLC). Cytosolic peptides are delivered across the ER membrane by the 
ATP-binding cassette (ABC) transporter TAP. The chaperones calnexin 
and calreticulin stabilize nascent MHC I molecules awaiting peptides. 
The tapasin/ERp57 heterodimer brings MHC I molecules and TAP 
within close proximity and catalyses peptide loading. Peptide-loaded 
MHC I molecules are then released from the ER and transported to the 
cell surface for antigen presentation. 

As the MHC I antigen presentation pathway plays a crucial role 
in eradicating intracellular pathogens, it is not surprising that 
some viruses have evolved the ability to interfere with this process 
(reviewed in ref. 2). The peptide transporter TAP in particular is a 
primary target for viral evasion (reviewed in ref. 3). TAP is a het- 
erodimeric ABC transporter that contains two subunits, TAP1 and 
TAP2, which share 37% sequence identity and are predicted to have 
similar structures. Each subunit contains an amino (N)-terminal 
transmembrane region (TMD0) that interacts with tapasin, followed 
by six transmembrane (TM) helices that form the peptide 
translocation pathway and a canonical nucleotide-binding domain 
(NBD) that hydrolyses ATP (Fig. 1b)*. The core TAP, devoid of the 
TMD0s, is necessary and sufficient for peptide transport*. So far, 
five viral proteins have been identified as TAP inhibitors. Four 
are encoded by members of the herpes virus family and one by 
cowpox virus’. These viral inhibitors are valuable tools for selective 
immune suppression and for understanding the fundamental 
mechanism of antigen presentation. 

Figure 1 | Purification and cryo-EM characterization of TAP. a, The 
MHC class I antigen presentation pathway. PLC: peptide-loading complex. 
b, Topology diagram of TAP1 (blue) and TAP2 (gold). The residue 
numbers of the C termini are indicated. c, Gel-filtration profile of the 
TAP-ICP47 complex. Inset: SDS-polyacrylamide gel electrophoresis of the 
peak fraction stained with Coomassie blue. d, A typical micrograph of the 
TAP-ICP47 complex after drift correction. Also shown are representative 
two-dimensional class averages of the particles. 

Figure 2 | Three-dimensional reconstruction. a, Stereo view of the 
overall density map, filtered to 6.5 Å. The α-carbon traces of the TAP core 
and the N-terminal 50 residues of ICP47 are also shown. b, Two views of 
the overall density map, coloured by protein subunit. c, Ribbon diagram of 
the poly-alanine model presented in two orthogonal views. The TM helices 
in the core of the transporter are labelled according to Fig. 1b. TAP1, blue; 
TAP2, gold; ICP47, magenta. 

Figure 3 | The viral inhibitor ICP47 plugs into the transmembrane 
pathway. a, EM density corresponding to ICP47 (black mesh), viewed 
from the plane of the membrane. Terminal residues in the ICP47 model 
are indicated. b, Binding of ICP47 to TAP, viewed along the membrane 
normal from the cytoplasm. c, The helix-loop-helix structure of ICP47 is 
conserved between HSV-1 and HSV-2. Conserved residues are highlighted 
in yellow. A red box highlights residues replaced by alanine in the ‘turn- 
to-helix’ (TtH) mutant. d, FACS analysis (three repeats) of MHC I surface 
expression in cells expressing wild-type (WT) and mutant ICP47. 


Here we focus our study on a TAP inhibitor encoded by herpes 
simplex virus (HSV). Both types of HSV, HSV-1 (oral herpes) and 
HSV-2 (genital herpes), somehow elude the human immune system 
and lead to a lifelong infection. The first clue as to how HSV bypasses 
the immune system came from observations that cells infected by HSV 
have reduced surface expression of MHC I molecules® and are resistant 
to cytotoxic T cells®. Since this resistance develops within 3 hours of 
HSV infection, researchers narrowed their search for the responsible 
gene to those few expressed in the early stage of infection”*®. Out of 
these, an 88-residue protein, ICP47, was found to bind to TAP and pre- 
vent peptide translocation into the ER”*®. Consequently, empty MHC I 
molecules were retained in the ER and viral peptide presentation was 
suppressed. Subsequent studies have shown that ICP47 interacts with 
TAP from the cytosolic side of the membrane and somehow prevents 
peptide binding”"”. The functional domain of ICP47 has been mapped 
to the N-terminal 35 residues'!”, which form an extended helix-loop- 
helix structure in lipid bilayers!°. 

Figure 4 | ICP47 precludes peptide binding and traps TAP in an inward-facing 
conformation. TAP functions via alternating access, cycling 
between two major conformations (right). In the absence of substrates, 
the transporter rests in an inward-facing state in which the two NBDs 
are separated and the translocation pathway is exposed to the cytosol. 

In this study, we pursued structural determination of a TAP-ICP47 
complex using cryo-electron microscopy (cryo-EM). The small size 
of the complex (166 kDa total) and the predicted pseudo-twofold 
symmetry between TAP1 and TAP2 make it extremely challenging 
to accurately align particles for three-dimensional reconstruction. 
To maximize the difference between the two TAP subunits, we used 
a shorter allele of TAP2 that lacks the last 17 amino acids in NBD2 
(Fig. 1b)'*. Co-expression of TAP1 and TAP2 in Pichia pastoris 
(Extended Data Fig. 1) produced a heterodimer that dissociates readily 
in detergents. However, by incubating the TAP-enriched membranes 
with ICP47 before detergent solubilization, the complex consisting of 
TAP1, TAP2, and ICP47 becomes more stable (Fig. 1c, d). Cryo-EM 
analysis of this complex (Extended Data Fig. 2 and Fig. 1d) produced 
a density map (Fig. 2) of an overall resolution of 6.5 Å, determined by 
the gold-standard refinement procedure (Extended Data Fig. 3)'°. In 
this reconstruction the TM helices and the connectivity between the 
helices are clearly resolved. The density corresponding to one NBD is 
significantly smaller than that for the other, allowing us to confidently 
differentiate TAP1 from TAP2 (Fig. 2b). Most importantly, we observe 
strong density corresponding to the functional region of ICP47, which 
reveals how this viral protein inhibits peptide translocation (Fig. 2b). 

The core region of TAP adopts an inverted ‘V’-shaped structure, with 
the two TMDs making close contact on the side corresponding to the 
ER lumen and the NBDs separated from each other inside the cytosol 
(Fig. 2c). Domain swapping of TM helices 4 and 5 across the TAP1/ 
TAP2 interface is a prominent structural feature (Fig. 2c). Indeed, the 
overall structure of TAP is very similar to that of other ABC exporters, 
including the lipid flippase MsbA from Gram-negative bacteria’®, the 
protein transporter PCAT1 from Gram-positive bacteria'’, and the 
multidrug transporter P-glycoprotein in eukaryotes'*’°. Although 
these transporters recognize very different substrates, they must share 
a common evolutionary origin and a common mechanism for coupling 
ATP hydrolysis to substrate translocation. 

No density was observed for the N-terminal TMD0 domain of either the 
TAP1 or the TAP2 subunit (Fig. 2). Studies have shown that the TMD0s 
are essential to the assembly of the large peptide-loading complex*”° 
but dispensable in peptide translocation*. ICP47 inhibits both full- 
length TAP and the core construct*. Our results indicate that in the 
absence of tapasin the two TMD0s are flexibly tethered to the core 
region of TAP. 

Biochemical data and homology modelling suggest that the peptide 
translocation pathway lies at the interface of the two TMDs”!*. Inside 
this pathway, we observe strong density consistent with the helix- 
loop-helix structure of ICP47 (Figs 2b and 3a). Guided by the NMR 
structure and secondary structure prediction}, we built residues 3-16 
into the shorter helical density and residues 22-40 into the longer density 
(Fig. 3). Additional density is packed along the cytosolic region 
of TAP2, into which we modelled residues 41-50. The carboxy 
(C)-terminal region, neither required for TAP inhibition nor conserved 
between HSV-1 and HSV-2 (Fig. 3c), is not resolved in the EM map, 
suggesting high mobility. 

On the basis of this model, the N-terminal half of ICP47 forms a 
hairpin-like structure pinned against the inner surface of TAP2 TM 
helices 2, 3, 6 and TAP1 TM helix 4 (Fig. 3b). The two helices of ICP47 
run anti-parallel to each other, connected by a sharp turn at the top 
of the TM cavity. The extensive packing between TAP and ICP47 is 
consistent with the nanomolar affinity of ICP47, orders of magnitude 
higher than those of the substrate peptides” !”. 

Upon association of substrates and ATP, the transporter undergoes a 
conformational change that reorients the TMDs and positions ATP at 
a closed NBD dimer interface for hydrolysis. ATP hydrolysis releases the 
substrate and resets the transporter to the resting state. ICP47 binds TAP 
stabilizing the inward-facing conformation (left). 

The potency of ICP47 appears to come from its helical hairpin 
structure, which provides a greater interface with TAP than a typi- 
cal substrate. To test whether the flexibility of the connecting loop is 
important, we constructed a ‘turn-to-helix’ mutant by replacing resi- 
dues 16-22 with alanine, which has the highest propensity to form an 
α-helix and thus would oppose (but not preclude) the formation of a 
turn. ICP47 activity was measured in human epithelial cells, in which 
cytosolic expression of ICP47 inhibits endogenous TAP and reduces 
the amount of MHC I molecules expressed on the cell surface. A green 
fluorescent protein (GFP) tag was fused to the C terminus of ICP47 
as a marker to select for cells expressing similar amounts of ICP47. 
Consistent with the EM structure and previous mutagenesis data!', the 
‘turn-to-helix’ mutant is much less potent than the wild-type construct. 
Specifically, mutant ICP47 reduced surface MHC I expression by only 
fivefold as opposed to a 20-fold reduction by wild-type ICP47 (Fig. 3d). 

Although ICP47 competes for the same binding site, we do not 
believe it mimics the substrate binding process. Unlike substrates, 
ICP47 inhibits rather than stimulates ATP hydrolysis”’. Furthermore, 
in contrast to ICP47, which separates the two NBDs, substrate binding 
induces partial closure of the NBDs”*. More recently, electron para- 
magnetic resonance studies showed that TAP binds its substrates in 
their extended conformations, comparable to how MHC I molecules 
present peptides”. 

Comparison of our structure with the NMR structure of ICP47 
(ref. 13) suggests that it undergoes major conformational changes 
upon association with TAP (Fig. 4). In isolation, the N-terminal two 
helices of ICP47 are flexibly linked and bind to the surface of the 
membrane at a slight tilt!®. In the complex with TAP, ICP47 forms 
a straight hairpin and inserts perpendicularly into the membrane. 
Although exact determinations of the amino acid register cannot be 
made at the current resolution, the overall structure readily explains 
how ICP47 inhibits peptide transport into the ER. By plugging a 
long helical hairpin into the translocation pathway, ICP47 directly 
blocks substrates from binding. Furthermore, because ICP47 is too 
large to be transported by TAP, its high-affinity binding traps TAP in 
an inactive conformation. Like other ABC transporters, TAP func- 
tions by alternating between two major conformations, each expos- 
ing the translocation pathway to one side of the membrane (Fig. 4). 
Binding of ICP47 stabilizes the inward-facing conformation, and thus 
prevents TAP from transitioning to an outward-facing state in which 
the NBDs form a closed dimer and the translocation pathway orientates 
towards the ER lumen (Fig. 4). 

In addition to viral inhibition discussed in this study, the structural 
basis of two cellular regulatory mechanisms has been elucidated****; 
both are relevant to nutrient uptake in bacteria. As a classic exam- 
ple of carbon catabolite repression, when a preferred carbon source is 
available, bacteria suppress the uptake of maltose through direct bind- 
ing of a regulatory protein to the maltose transporter*®. Methionine 
and molybdate transporters offer another example””*. In both cases, 
at high intracellular concentration, the substrate binds and inhibits 
the corresponding transporter, a feedback mechanism that limits the 
amount of import into the cell?”?8. Unlike viral inhibition, both cel- 
lular inhibitions are allosteric and reversible, regulated by the meta- 
bolic state of the cells. One common theme among all these inhibition 
mechanisms is that the inhibitor binds and stabilizes the transporter in 
the inward-facing state, a conformation unable to hydrolyse ATP. We 
speculate that this strategy may be advantageous in preserving cellular 
energy sources. It is also possible that for most ABC transporters, the 
inward-facing state is most common and thus naturally targeted by 
regulators. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Expression and purification of ICP47. The gene encoding human herpes 
simplex virus 1 (HSV-1) ICP47 was synthesized (BioBasic) and subcloned 
into the ligation-independent cloning vector pMCSG20 (AmpR), which contained 
an N-terminal glutathione S-transferase (GST) affinity tag preceding 
a tobacco etch virus (TEV) protease cleavage site. BL21(DE3)RIL Escherichia 
coli cells containing the pMCSG20-ICP47 vector were grown to mid-expo- 
nential phase at 37°C in Lysogeny broth (LB) medium and expression was 
induced with 200 µM isopropyl β-D-thiogalactoside (IPTG) at 20°C for 24 h. 
The cells were harvested via centrifugation (4,000g for 12 min at 4°C) and 
broken by two passes through a high-pressure homogenizer (EmulsiFlex-C3; 
Avestin). Cell lysate supernatants were loaded onto glutathione sepharose 
4B resin (GE Healthcare) equilibrated with PBS buffer, pH 7.4, containing 
5 mM D,L-dithiothreitol (DTT). To remove the GST tag from ICP47, the column 
was equilibrated into a TEV protease cleavage buffer containing 50 mM Tris-HCl, 
pH 8.0, 200 mM NaCl, 0.5 mM EDTA, and 5 mM DTT and incubated with 
TEV protease overnight. Untagged ICP47 was eluted and further purified by 
Superdex 75 gel-filtration chromatography (GE Healthcare) in TEV-cleavage 
buffer. Fractions containing the protein were pooled, concentrated to 3 mg ml⁻¹, 
and flash-frozen in liquid nitrogen. 

Co-expression of TAP1/TAP2. Synthetic human TAP1 and TAP2 genes were 
codon-optimized for expression in P. pastoris (BioBasic) and subcloned into 
pPICZ-C-XE-protein A (ZeoR) and pPICZ-C-XE vectors (ZeoR), respectively. To 
ensure co-expression at a 1:1 molar ratio, we took advantage of the compatible 
cohesive ends of BamHI and BglII restriction sites to generate an expression cas- 
sette as outlined in Extended Data Fig. 1. The resultant vector was linearized by 
PmeI digestion and transformed into a HIS+ strain of SMD1163 by electroporation 
(BioRad Gene Pulser II). Transformants were selected on yeast extract peptone 
dextrose sorbitol (YPDS) agar containing 800 µg ml⁻¹ zeocin. Colonies were grown 
in yeast extract peptone dextrose (YPD) cultures at 28°C until they reached an 
absorbance at 600 nm (A600 nm) of 4 to seed flasks containing minimal glycerol 
medium (MGY), 13.4% yeast nitrogen base, and 1% glycerol at a starting A600 nm 
of 0.5. The MGY cultures were grown at 28°C for 24 h until they reached an A600 nm 
of 20, at which point they were harvested by centrifugation (1,500g for 15 min 
at 4°C) and used to seed flasks containing minimal methanol medium (MMY), 
13.4% yeast nitrogen base, and 0.5% methanol at a starting A600 nm of 10. These 
MMY cultures were grown at 28°C for 24 h before harvesting (1,500g for 15 min at 
4°C). Cell pellets were fractionated and flash-frozen in liquid nitrogen. 
Purification of the TAP-ICP47 complex. Cells expressing TAP1-protein 
A/TAP2 were lysed using a mixer mill (Retsch Mixer Mill 400) and incubated 
with purified ICP47 in a buffer containing 50 mM Tris-HCl, pH 8, 500 mM NaCl, 
15% glycerol, DNase I, protease inhibitors, and 2 mM TCEP for 30 min. Cells were 
then solubilized with 1.5% n-dodecyl-β-D-maltoside (DDM; Anatrace) for 2 h. The 
solubilized fraction was isolated by centrifugation (70,000g for 40 min at 4°C). The 
TAP1-protein A/TAP2/ICP47 complex was isolated using the protein A affinity tag 
on TAP1 by IgG Sepharose 6 Fast Flow (GE Healthcare). After extensive washing, 
PreScission protease (GE Healthcare) was added to the column and incubated 
overnight to remove the protein A tag. The complex was eluted with additional 
buffer and further purified using a Superose 6 column (GE Healthcare) equilibrated 
with 20 mM HEPES, pH 7.4, 150 mM NaCl, 2 mM TCEP, 1 mM DDM, 
and 1 mM octaethylene glycol monododecyl ether (C12E8; Anatrace). The peak 
fraction was used to prepare cryo-EM grids. 

Initial cryo-EM imaging and generation of an initial model. Vitrified spec- 
imens of TAP-ICP47 complex were prepared on glow-discharged Quantifoil 
holey carbon grids by plunge-freezing into liquid ethane using a Vitrobot (FEI). 
Cryo-EM data were collected at liquid-nitrogen temperature using a K2 Summit 
direct electron detector camera (Gatan) on a Tecnai F20 electron microscope 
(FEI) operating at 200 keV. Dose-fractionated image stacks were recorded with 
UCSF Image 4 (ref. 29) in super-resolution counting mode at a calibrated magnification 
of ×40,410 (nominal magnification of ×29,000) with a dose rate of 
8 electrons per pixel per second (5.2 electrons per square angstrom per second). 
Frames were read out every 200 ms and 30 frames were collected, resulting in 
an exposure time of 6 s and a total dose of 31.2 electrons per square angstrom. 
Dose-fractionated image stacks were twice binned and motion-corrected, as 
described”’. The defocus was determined with CTER*’. BOXER was used to 
interactively pick 28,813 particles from ~750 images*!. The particle images 
were subjected to the iterative stable alignment and clustering (ISAC) proce- 
dure* implemented in SPARX*?. Four ISAC generations specifying 100 particles 
per group and a pixel error threshold of 0.7 yielded 324 averages. Of these, 
270 averages were used to calculate an initial three-dimensional density map 
with the validation of individual parameter reproducibility (VIPER) procedure 
in SPARX. 
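
The exposure figures quoted for the initial Tecnai F20 data set are linked by simple arithmetic, which the short sketch below reproduces. It assumes a physical detector pixel of 5 µm for the K2 Summit (a value not given in the text) and that the quoted per-pixel dose rate refers to that physical pixel; all variable names are illustrative only.

```python
# Back-of-envelope check of the Tecnai F20 exposure parameters quoted above.
# Assumption (not from the text): the K2 Summit physical pixel is 5 µm wide.

PHYSICAL_PIXEL_UM = 5.0       # assumed detector pixel size, micrometres
CALIBRATED_MAG = 40_410       # calibrated magnification (from the text)
DOSE_RATE_PER_PIXEL = 8.0     # electrons per pixel per second (from the text)
DOSE_RATE_PER_A2 = 5.2        # electrons per square angstrom per second (from the text)
FRAME_TIME_S = 0.2            # one frame read out every 200 ms
N_FRAMES = 30                 # frames per image stack

# Pixel size at the specimen = detector pixel / magnification (1 µm = 1e4 Å).
pixel_size_a = PHYSICAL_PIXEL_UM * 1e4 / CALIBRATED_MAG

# Convert the per-pixel dose rate into a per-area dose rate.
dose_rate_from_pixel = DOSE_RATE_PER_PIXEL / pixel_size_a**2

# Exposure time and accumulated dose over the whole stack.
exposure_s = FRAME_TIME_S * N_FRAMES
total_dose = DOSE_RATE_PER_A2 * exposure_s

print(f"pixel size at specimen : {pixel_size_a:.3f} Å")
print(f"dose rate (from pixel) : {dose_rate_from_pixel:.2f} e/Å^2/s")
print(f"exposure time          : {exposure_s:.1f} s")
print(f"total dose             : {total_dose:.1f} e/Å^2")
```

Under the stated assumption, the per-area dose rate derived from the per-pixel value agrees with the quoted 5.2 electrons per square angstrom per second to within rounding, and the exposure time and total dose match the values given in the text.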

EM sample preparation and imaging for the final three-dimensional reconstruction. 
Cryo-EM grids were prepared by pipetting 3 µl of freshly purified 
TAP-ICP47 (2 mg ml⁻¹) onto glow-discharged C-flat holey carbon 
CF-1.2/1.3-4C grids (Protochips) and letting the sample adsorb for 20 s. The 
grids were blotted for 4 s at 90% humidity using a Vitrobot Mark IV (FEI) 
and immediately plunge-frozen in liquid-nitrogen-cooled liquid ethane. The 
grids were imaged using a FEI Titan Krios electron microscope operating at 
an acceleration voltage of 300 keV. Images were recorded using a K2 Summit 
direct electron detector (Gatan) set to super-resolution counting mode with a 
super-resolution pixel size of 0.675 Å using the program SerialEM™. In addition, 
a Gatan Imaging filter with a slit width of 20 eV was used to remove inelastically 
scattered electrons. Movie frames were recorded with an exposure time 
of 200 ms using a dose rate of 10 electrons per pixel per second or 5.5 electrons 
per square angstrom per second (1.35 Å at the image plane). Three data sets 
were recorded using different total doses and defocus ranges (Extended Data 
Table 1 and Extended Data Fig. 2). 
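
The pixel sizes and dose rates quoted for the Titan Krios data can be related with a few lines of arithmetic. The sketch below assumes that the 1.35 Å pixel is the super-resolution pixel binned twofold and that the per-pixel dose rate refers to this 1.35 Å pixel; the variable names are illustrative and the Nyquist limit is included only for orientation.

```python
# Relating the Titan Krios acquisition numbers quoted above.
# Assumption: the per-pixel dose rate refers to the 1.35 Å (binned) pixel.

SUPER_RES_PIXEL_A = 0.675     # super-resolution pixel size, Å (from the text)
BIN_FACTOR = 2                # assumed twofold binning of the super-resolution pixel
DOSE_RATE_PER_PIXEL = 10.0    # electrons per pixel per second (from the text)

# Pixel size after binning.
binned_pixel_a = SUPER_RES_PIXEL_A * BIN_FACTOR

# Dose rate per unit area: divide by the area covered by one pixel.
dose_rate_per_a2 = DOSE_RATE_PER_PIXEL / binned_pixel_a**2

# The finest detail a sampled image can represent is twice the pixel size.
nyquist_a = 2.0 * binned_pixel_a

print(f"binned pixel size : {binned_pixel_a:.2f} Å")
print(f"dose rate         : {dose_rate_per_a2:.2f} e/Å^2/s")
print(f"Nyquist limit     : {nyquist_a:.2f} Å")
```

With these assumptions the computed dose rate rounds to the quoted 5.5 electrons per square angstrom per second, and the Nyquist limit of the binned images lies well beyond the 6.5 Å resolution reported for the final map.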

Image processing. Movie frames were corrected for gain reference and binned by 
a factor of 2, giving a pixel size of 1.35 Å. Drift correction was performed using 
the program Unblur®***. Next, the drift-corrected frames were summed into 
single micrographs, which were used to estimate the contrast transfer function 
(CTF) using CTFFIND4*. The program Summovie was used to recalculate the 
summed images, first with a low-pass filter for autopicking and then with the 
noise power restored after filtering for particle extraction**. Autopicking, particle 
extraction, two-dimensional classification, three-dimensional classification, and 
initial three-dimensional refining were all performed in Relion®’. To achieve a 
more robust classification of the extracted particles, two-dimensional classification 
was performed with a particle mask diameter of 145 Å while ignoring the effects of 
the CTF until the first zero transition. Three-dimensional classification was per- 
formed on particles from selected two-dimensional classes using the initial model 
calculated in SPARX as a reference map. Particles from selected three-dimensional 
classes from both data sets were combined for three-dimensional refinement. Using 
the orientation parameters determined by Relion, three-dimensional refinement 
in FREALIGN was also performed“. The final map, reconstructed from 139,293 
particles, had a resolution of 6.5 Å as determined by Fourier shell correlation (FSC) 
of independently refined half-data sets using the 0.143 cut-off criterion (Extended 
Data Fig. 3). 
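
To make the resolution criterion concrete, the following is a minimal NumPy sketch of a Fourier shell correlation between two half-maps and of reading off the resolution at the 0.143 cut-off. It is not the Relion or FREALIGN implementation: it applies no masking or noise substitution, the function names are hypothetical, and the demonstration uses synthetic random volumes rather than real reconstructions.

```python
import numpy as np


def fourier_shell_correlation(half1, half2):
    """FSC per integer-radius Fourier shell for two cubic maps of equal size."""
    if half1.ndim != 3 or half1.shape != half2.shape or len(set(half1.shape)) != 1:
        raise ValueError("expected two cubic 3D maps of identical shape")
    n = half1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))

    # Distance of every Fourier voxel from the origin, rounded to a shell index.
    axis = np.arange(n) - n // 2
    zz, yy, xx = np.meshgrid(axis, axis, axis, indexing="ij")
    shell = np.rint(np.sqrt(xx**2 + yy**2 + zz**2)).astype(int)

    n_shells = n // 2                  # only shells below the Nyquist frequency
    keep = shell < n_shells
    idx = shell[keep]
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    power1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    power2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)
    return cross / np.sqrt(power1 * power2)


def resolution_at_cutoff(fsc, pixel_size_a, box_size, cutoff=0.143):
    """Resolution (Å) of the first shell (beyond the origin) with FSC below the cut-off."""
    below = np.nonzero(fsc[1:] < cutoff)[0]
    if below.size == 0:
        return 2.0 * pixel_size_a      # curve never crosses: limited by Nyquist
    shell = below[0] + 1               # +1 because the origin shell was skipped
    # Shell k corresponds to a spatial frequency of k / (box_size * pixel_size).
    return box_size * pixel_size_a / shell


# Synthetic demonstration (random volumes, not real half-maps).
rng = np.random.default_rng(0)
box, pixel = 32, 1.35
map_a = rng.normal(size=(box, box, box))
map_b = rng.normal(size=(box, box, box))

fsc_same = fourier_shell_correlation(map_a, map_a)    # identical halves
fsc_noise = fourier_shell_correlation(map_a, map_b)   # unrelated noise
res_same = resolution_at_cutoff(fsc_same, pixel, box)

print("FSC of identical maps stays at 1:", bool(np.allclose(fsc_same, 1.0)))
print("mean |FSC| of unrelated noise   :", float(np.mean(np.abs(fsc_noise[5:]))))
```

Identical maps correlate perfectly in every shell, whereas unrelated noise gives values scattered around zero; a real pair of independently refined half-maps falls between these extremes, and the reported resolution is the spatial frequency at which the curve drops through 0.143.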

Model building. We used a model of the TAP1 NBD from a previously reported 
structure of the isolated domain (PDB accession number 1JJ7)*! to generate a 
homology model for the TAP2 NBD using the program Modeller”. We also gen- 
erated a homology model of the TAP1 and TAP2 TMDs using the half-transporter 
subunit of human ABCB10 (PDB accession number 4AYT) as the source structure*’. 
We manually docked these poly-alanine models into our final cryo-EM 
map and rebuilt each model in Coot’. 

Flow cytometry analysis of MHC I surface expression. MHC I surface expres- 
sion was analysed using the phycoerythrin-coupled antibody W6/32 (Abcam), 
which recognizes a monomorphic epitope shared among MHC class I mole- 
cules. Genes encoding HSV-1 wild-type ICP47 and the ‘turn-to-helix’ (TtH) 
mutant were cloned into a modified version of the pCDNA 3.1 vector (Life 
Technologies) that added a C-terminal enhanced GFP tag. HeLa cells (ATCC 
CCL-2) were seeded in six-well plates at a density of 5 x 10° cells per well and 
transfected with wild-type ICP47, the TtH mutant, or the empty vector. Cells 
were detached from the plate using trypsin-EDTA (0.05%) at 72 h after transfection 
and washed in ice-cold FACS buffer (Ca²⁺/Mg²⁺-free phosphate buffer, 
10% FCS, 1% sodium azide) and centrifuged at 400g for 5 min at 4°C. Non- 
specific binding was blocked by incubating the cells with phosphate buffer con- 
taining 5% (w/v) bovine serum albumin (BSA) for 15 min on ice. Antibody was 
added at 5 µg ml⁻¹ and incubated for 30 min at 4°C in the dark. Subsequently, 
the cells were washed three times in FACS buffer, resuspended at a density of 
3 × 10⁶ cells per millilitre, and counted using a BD LSR II Flow Cytometer (BD 
Biosciences). The cells were analysed at wavelengths 405 nm for DAPI nuclear 
stain, 488 nm for GFP fluorescence, and 561 nm for phycoerythrin fluorescence. 
Only live, single cells with the same levels of GFP fluorescence were used in 
phycoerythrin gating to compare MHC class I expression. The flow cytometry 
data were analysed using FlowJo 10.1 single cell analysis software (Tree Star). 
All experiments were repeated three times. The cell line was tested for myco- 
plasma contamination by PCR using a Universal Mycoplasma Detection Kit 
(ATCC 30-1012K). 

Figure preparation. Figures were prepared using the programs PyMOL*, 
Chimera*“®, and FlowJo 10.1. 
Extended Data Figure 2 | Cryo-EM data processing flowchart. 

Extended Data Figure 3 | FSC indicating the resolution of the density map. FSC plots were generated between reconstructions from random halves 
of the data. The frequency at which the dashed line passes through FSC = 0.143 indicates the reported resolution. Corresponding values are given in 
Extended Data Table 1. 

Extended Data Table 1 | Summary of cryo-EM data 

Data collection 
  Microscope                  Titan Krios I, 300 keV (FEI) 
  Detector                    K2 Summit direct electron detector (Gatan) 
  Pixel size                  1.35 Å 
  Energy filter               20 eV (Gatan) 

                              Dataset 1       Dataset 2       Dataset 3 
  Movies                      692             1524            2947 
  Frames                      20              40              40 
  Dose (electrons/Å²/frame)   2.2             2.2             2.2 
  Defocus range (µm)          -1.6 to -3.0    -2.0 to -5.0    -1.5 to -3.5 

Final reconstruction 
  Number of particles         139,293 
  Accuracy of rotations       2.78 degrees 
  Accuracy of translations    0.98 pixels 
  Overall resolution          6.5 Å 
  B-factor correction         -500 Å² 

Structure of the E6/E6AP/p53 complex required for 
HPV-mediated degradation of p53 


Denise Martinez-Zapien1, Francesc Xavier Ruiz2, Juline Poirson1, André Mitschler2, Juan Ramirez1, Anne Forster1, 
Alexandra Cousido-Siah2, Murielle Masson1, Scott Vande Pol3, Alberto Podjarny2, Gilles Travé1 & Katia Zanier1 


The p53 pro-apoptotic tumour suppressor is mutated or 
functionally altered in most cancers. In epithelial tumours 
induced by ‘high-risk’ mucosal human papilloma viruses, 
including human cervical carcinoma and a growing number of 
head-and-neck cancers’, p53 is degraded by the viral oncoprotein 
E6 (ref. 2). In this process, E6 binds to a short leucine (L)-rich LxxLL 
consensus sequence within the cellular ubiquitin ligase E6AP°. 
Subsequently, the E6/E6AP heterodimer recruits and degrades 
p53 (ref. 4). Neither E6 nor E6AP are separately able to recruit 
p53 (refs 3, 5), and the precise mode of assembly of E6, E6AP and 
p53 is unknown. Here we solve the crystal structure of a ternary 
complex comprising full-length human papilloma virus type 16 
(HPV-16) E6, the LxxLL motif of E6AP and the core domain of 
p53. The LxxLL motif of E6AP renders the conformation of E6 
competent for interaction with p53 by structuring a p53-binding 
cleft on E6. Mutagenesis of critical positions at the E6-p53 
interface disrupts p53 degradation. The E6-binding site of p53 
is distal from previously described DNA- and protein-binding 
surfaces of the core domain. This suggests that, in principle, E6 
may avoid competition with cellular factors by targeting both free 
and bound p53 molecules. The E6/E6AP/p53 complex represents 
a prototype of viral hijacking of both the ubiquitin-mediated 
protein degradation pathway and the p53 tumour suppressor 
pathway. The present structure provides a framework for the 
design of inhibitory therapeutic strategies against oncogenesis 
mediated by human papilloma virus. 

Papilloma viruses are small DNA viruses, which infect the mucosal 
and cutaneous epithelia of most vertebrate species. HPV-16 is the most 
prevalent and best studied ‘high-risk’ mucosal human papilloma virus 
(hrm-HPV), responsible for 50% of cervical carcinomas and for most 
HPV-positive head-and-neck cancers!. The HPV oncoproteins E6 and 
E7 recognize numerous host proteins, in large part by hijacking cellular 
domain-motif interaction networks®. In particular, most mucosal and 
cutaneous E6 proteins recognize cellular acidic leucine (L)-rich LxxLL 
motifs (reviewed in ref. 7). In a recent structural study’, we have shown 
that LxxLL motifs bind to a conserved pocket of E6, which is contrib- 
uted by the protein’s amino (N)- and carboxy (C)-terminal zinc-binding 
domains (E6N and E6C) and helix linker. 

In E6-mediated degradation of p53, hrm-HPV E6 proteins interact 
with the LxxLL motif of E6AP, leading to recruitment and polyubiq- 
uitination of p53. The isolated LxxLL peptide of E6AP (named e6ap 
from hereon) is sufficient to render E6 liable to interact with p53 
(ref. 5). Furthermore, several studies indicate that the ‘core’ (DNA 
binding) domain of p53 is required for the interaction with E6/ 
E6AP®!!. We thus proceeded to reconstitute a minimal E6/E6AP/ 
p53 ternary complex in vitro (Extended Data Fig. 1). The solubility-enhanced 
HPV-16 E6 4C/4S mutant (named E6 from hereon), which 
degrades p53 with wild-type efficiency’, was assembled with e6ap 


(sequence E1L2T3L4Q5E6L7L8G9E10E11R12) fused to a crystallization- 
prone mutant of the maltose binding protein (MBP)* (Extended 
Data Fig. 2). The resulting E6/MBP-e6ap heterodimer (named 
E6/e6ap from hereon) was found to interact with the isolated p53 




Figure 1 | Structure of the HPV-16 E6/e6ap/p53core ternary complex. 

a, Ribbon (top) and surface (bottom) representations. Green: e6ap peptide; 
gold: HPV-16 E6; blue: p53core. Spheres: zinc atoms. Boxes indicate 
sub-interfaces I-III (expanded in Fig. 2b). b, Surface representation of 

E6 coloured for residues in atomic contact with p53 (light blue), e6ap 
(light green), and both p53 and e6ap (dark grey). E6N and E6C: N- and 
C-terminal zinc-binding domains; HL: helix linker. E6é molecules on the 
left side of a and b are in the same orientation. 


1Equipe labellisée Ligue, Biotechnologie et signalisation cellulaire UMR 7242, Ecole Superieure de Biotechnologie de Strasbourg, Boulevard Sébastien Brant, BP 10413, F-67412 Illkirch, France. 
2Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC)/INSERM U964/CNRS UMR 7104/Université de Strasbourg, 1 rue Laurent Fries, BP 10142, F-67404 Illkirch, France. 
3Department of Pathology, University of Virginia, PO Box 800904, Charlottesville, Virginia 22908-0904, USA. 
Figure 2 | Intermolecular contacts at the E6-p53 interface. a, Alignment 
of HPV E6 sequences from hrm- and lrm-HPV groups. Histograms: 
burial of residues at the interface with e6ap (light green) and p53 (light 
blue). Asterisks indicate positions conserved in hrm-HPV. b, Views of 
sub-interfaces I-III. Red spheres: water molecules; thick dashed lines: 
direct polar interactions; thin dotted lines: water-mediated interactions. 


core domain (residues 94-292, named p53core from hereon) by 
gel filtration chromatography and isothermal titration calorimetry 
(dissociation constant Kd = 22 µM) (Extended Data Fig. 3 and 
Extended Data Table 1). In vivo, this affinity of p53 for E6/E6AP is 
likely to be enhanced by avidity effects, since p53 is tetrameric and 
E6AP can form trimers)’. 

The E6/e6ap/p53core ternary complex yielded several crystals diffracting 
up to 2.25 Å resolution using synchrotron radiation. This 
allowed structure determination by molecular replacement (Fig. 1a 
and Extended Data Table 2). The asymmetric unit of the crystal com- 
prises two E6/e6ap/p53core heterotrimers, which contact each other 
mostly via MBP and display nearly identical structures except for the 
relative orientation of the MBP moieties (Extended Data Fig. 4). The 
structures of p53core and E6/e6ap observed in the heterotrimers are 
superimposable with previous structures of p53core and of E6/e6ap 
heterodimer, except for residues 1-8 of E6 and 10-12 of e6ap, which 
change conformation upon p53 binding (Extended Data Fig. 5). The 
similarities between the structures of the two heterotrimers in the 
crystal and previously solved structures of separate elements suggest 
that MBP does not significantly alter the overall conformation of the 
E6/e6ap/p53core complex. 

In each heterotrimer p53core binds to a cleft, which is formed by 
the E6N and E6C domains and held in place by contacts tethering the 
domains to the e6ap peptide (Figs 1b and 2a). The E6-p53 interface 
covers approximately 1,200 A”. The C terminus of the e6ap peptide 
(residues 10-12, Extended Data Fig. 6a) also lies proximal to p53core 
(Extended Data Fig. 6b), but its structure is poorly defined, possibly 
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Upper and lower case fonts: residues mediating polar interactions via 
side chain and backbone, respectively. c, Key interaction networks. Gold 
fonts: E6 residues; blue fonts: p53 residues; green boxed black fonts: e6ap 
residues; thick grey dashed lines: non-polar interactions; black lines: 
polar interactions involving side-chain groups; black dotted lines: polar 
interactions involving backbone groups; red circles: water molecules. 


owing to an influence of the adjacent MBP tag. Nevertheless neither 
point mutations at residues 10-12 of e6ap nor extension of the e6ap 
peptide’s C-terminal boundary altered p53 binding affinity (Extended 
Data Fig. 6c and Extended Data Table 1), suggesting that the e6ap 
C terminus does not contribute significant intermolecular contacts 
to p53. 

The E6-p53 interface can be divided into three sub-interfaces (Figs 1a
and 2b). Sub-interface I is dominated by polar interactions and brings
together residues in the N-terminal arm and α1 helix of E6 and residues
of the N-terminal arm, β1 and β10 strands of p53core (Fig. 2b, c).
E6 residues Glu7 and Glu18 establish a bi-dentate salt bridge with 
Lys101 of p53 as well as water-mediated interactions with other p53 
residues. In particular, Glu18 contributes to a network involving side 
chains and backbone groups of Gln14, Arg10 and Lys11 of E6, and 
of Thr102 and Asn268 of p53. Consistently, E6 mutations E18A and 
E18R impair ternary complex assembly (Fig. 3a) and p53 degradation 
(Fig. 3b). Furthermore, Gln104 and Gly105 of p53 hydrogen bond 
to the backbone of E6 residues Arg8 and Gln6, respectively, thus 
altering the conformation of the N-terminal arm of E6 (residues 1-8) 
(Extended Data Fig. 5). 

Sub-interface II mainly involves the α2 helix of the E6N domain and
the α2 helix of p53core, interacting through hydrophobic, charged and
polar contacts (Fig. 2b, c). E6 residue Phe47 intercalates between p53
residues Ala129 and Arg290. In turn, Arg290 establishes a salt bridge
with Asp44 of E6. These key interactions explain the reported
dominant-negative phenotype of the E6 F47R mutant, which is defective
for p53 degradation, restores high p53 levels and drives senescence (ref. 14).


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Consistently, here we find that E6 F47R is defective for interaction with 
both p53core (Extended Data Fig. 3a) and full-length p53 (Fig. 3a). 
Besides Phe47, other E6 residues in sub-interface II (Ile23, His24 and 
Tyr43) provide hydrophobic contacts, whereas E6 residue Asp49 establishes
polar interactions with His115 in the L1 loop of p53. Mutations
disrupting the Asp44-Arg290 and Asp49-His115 interactions (D44R
and D49A in E6 and R290E in p53) impair both p53 ternary complex
assembly (Fig. 3a) and p53 degradation (Fig. 3b). By contrast,
creation of a swapped Glu290-Arg44 salt bridge, by combining 
D44R E6 and R290E p53 mutants, partly restores p53 degradation 
(Fig. 3b). 

Finally, sub-interface III encompasses hydrophobic interactions
between E6C residues Leu100 and Pro112 and p53core surface
residues Leu114 and Trp146, as well as several water-mediated
backbone-to-backbone contacts (Fig. 2b, c). Notably, Leu100 and
Pro112 of E6C are proximal to conserved Arg102, which shapes the p53
binding cleft by mediating crucial interactions with the E6N domain
and the e6ap peptide (ref. 8).

Degradation of p53 is a hallmark activity of hrm-HPV E6
proteins (refs 15-17). The structure explains the critical contributions
towards p53 degradation of particular residues, which are differentially
conserved in high- versus low-risk mucosal (lrm) HPV
E6. In sub-interface I, most conserved positions in hrm-HPV E6
proteins correspond to Phe2, Pro5, Glu7, Arg8 and Pro9 (Fig. 2a). 
While Glu7 mediates direct contacts to p53, Phe2, Pro5, Arg8 and 
Pro9 play an indirect role at the interface by shaping the conforma- 
tion of the N-terminal region (Extended Data Fig. 7). Indeed, site- 
directed mutagenesis of Phe2, Pro5, Arg8 or Pro9 (refs 18-20) 
impairs ternary complex assembly and p53 degradation activities. 
In sub-interface II, most conserved positions in hrm-HPV corre- 
spond to Phe47, Asp44 and Asp49 (Fig. 2a), which are here found 
to mediate crucial contacts to p53. These residues were formerly
found to be important for p53 degradation (refs 14, 19, 21). Notably, Phe47
and Asp44 also participate in the dimerization of the E6N domain
and subsequent in vitro self-association of the entire E6 protein (ref. 12).
Therefore, E6 self-association and E6 binding to p53 are two distinct
and competing processes mediated by partly overlapping interaction
surfaces.

The LxxLL peptide of E6AP is absolutely required for E6 binding to 
p53, yet it does not contribute significant contacts to p53. Furthermore, 
in the absence of the LxxLL motif of E6AP, the few interactions con- 
necting E6N, linker helix and E6C should be insufficient to maintain 
the overall E6 architecture observed in the E6/e6ap heterodimer struc- 
ture (see ref. 8 for further discussion). These observations indicate that 
the LxxLL motif structures the p53 binding cleft on E6, thereby render- 
ing E6 competent for interaction with p53. Interestingly, we have shown 
that an in vitro selected peptide, targeting the LxxLL pocket of HPV-16 
E6, induces recruitment of p53 to E6 (refs 22, 23). Consequently, cellu- 
lar proteins other than E6AP, which bind to the LxxLL pocket, might 
also promote the E6-p53 interaction.

The p53core is both a DNA binding domain and a protein-protein 
interaction hub. E6 interacts with the N-terminal arm of p53core and 
one of the edges of the β-sandwich. Previous mutagenesis studies
had already suggested a role for these regions in E6 binding (refs 11, 24).
The E6-binding interface on p53core is distal from both the DNA- 
binding region and protein-binding surfaces observed in all other 
solved complexes of p53core (Fig. 4a). Indeed, previously described 
protein-binding interfaces of p53core all overlap, albeit to differ- 
ent extents, with the DNA-binding region. In particular, the large 
T antigen (LTag) of oncovirus SV40, which does not degrade p53,
buries the entire DNA binding interface of p53core, thereby inhibiting
p53 trans-activation activity (ref. 25). Consistent with these
observations, two frequent cancer-associated mutations (R273C and
R273L), affecting a prominent DNA-binding arginine residue of
p53core, abolish binding to SV40 LTag, but do not alter the interaction
with E6/E6AP (ref. 26). Therefore, E6 might avoid competition with
cellular factors by targeting both ‘free’ p53 as well as p53 bound to 
DNA or other proteins (Fig. 4b). This, along with the irreversible 
character of the degradation process, renders E6 a potent inactivator 
of p53 functions. 
Figure 4 | p53core targeting by HPV E6. a, Complexes of p53core
(surface representation) with protein or DNA partners (ribbon 
representation). Left: p53core bound to HPV E6/e6ap. Right: p53core 
bound to 53BP2, 53BP1 and BCL-xL cellular proteins, to adenovirus 
SV40 LTag oncoprotein and to DNA. p53core binding interfaces: E6 (light 
orange), other proteins (pink) and DNA (cyan). b, Cartoon summarizing 


Recent studies employing pro-apoptotic peptides as well as small
molecules (refs 27, 28) directed against the LxxLL pocket have provided
experimental evidence that this pocket is druggable. The p53-binding cleft
observed in the present structure may represent a second potential
binding site for drugs. Combinatorial strategies targeting both the
LxxLL pocket and the p53-binding cleft could result in efficient
disruption of the E6/E6AP/p53 complex.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 11 August; accepted 23 November 2015. 
Published online 20 January 2016. 


1. Bosch, F. X. et al. Comprehensive control of human papillomavirus infections 
and related diseases. Vaccine 31 (Suppl. 7), H1-H31 (2013).

2. Scheffner, M., Werness, B. A., Huibregtse, J. M., Levine, A. J. & Howley, P. M. The 
E6 oncoprotein encoded by human papillomavirus types 16 and 18 promotes 
the degradation of p53. Cell 63, 1129-1136 (1990). 

3. Huibregtse, J. M., Scheffner, M. & Howley, P. M. Localization of the E6-AP 
regions that direct human papillomavirus E6 binding, association with p53, 
and ubiquitination of associated proteins. Mol. Cell. Biol. 13, 4918-4927 
(1993). 

4. Scheffner, M., Nuber, U. & Huibregtse, J. M. Protein ubiquitination involving 
an E1-E2-E3 enzyme ubiquitin thioester cascade. Nature 373, 81-83 
(1995). 

5. Ansari, T., Brimer, N. & Vande Pol, S. B. Peptide interactions stabilize and 
restructure human papillomavirus type 16 E6 to interact with p53. J. Virol. 86, 
11386-11391 (2012). 

6. Davey, N. E., Travé, G. & Gibson, T. J. How viruses hijack cell regulation. Trends 
Biochem. Sci. 36, 159-169 (2011). 

7. Vande Pol, S. B. & Klingelhutz, A. J. Papillomavirus E6 oncoproteins. Virology 
445, 115-137 (2013). 

8. Zanier, K. et al. Structural basis for hijacking of cellular LxxLL motifs by 
papillomavirus E6 oncoproteins. Science 339, 694-698 (2013). 

9. Gu, J., Rubin, R. M. & Yuan, Z. M. A sequence element of p53 that determines 
its susceptibility to viral oncoprotein-targeted degradation. Oncogene 20, 
3519-3527 (2001). 

10. Li, X. & Coffino, P. High-risk human papillomavirus E6 protein has two distinct 
binding sites within p53, of which only one determines degradation. J. Virol. 
70, 4509-4516 (1996). 

11. Bernard, X. et al. Proteasomal degradation of p53 by human papillomavirus E6 
oncoprotein relies on the structural integrity of p53 core domain. PLoS ONE 6, 
e25981 (2011). 

12. Zanier, K. et al. Solution structure analysis of the HPV16 E6 oncoprotein reveals 
a self-association mechanism required for E6-mediated degradation of p53. 
Structure 20, 604-617 (2012). 


544 | NATURE | VOL 529 | 28 JANUARY 2016 



how the LxxLL motif shapes the p53 binding cleft on E6, which recognizes 
a distinct interface on p53core, enabling targeting of both free and bound 
p53 molecules. Tetrameric p53 is shown with one subunit coloured for 
interfaces binding to HPV E6 (yellow), DNA (cyan and purple), and other 
host proteins (pink and purple). 


13. Ronchi, V. P., Klein, J. M., Edwards, D. J. & Haas, A. L. The active form of
E6-associated protein (E6AP)/UBE3A ubiquitin ligase is an oligomer. J. Biol. 
Chem. 289, 1033-1048 (2014). 

14. Ristriani, T., Fournane, S., Orfanoudakis, G., Travé, G. & Masson, M. A 
single-codon mutation converts HPV16 E6 oncoprotein into a potential tumor 
suppressor, which induces p53-dependent senescence of HPV-positive HeLa 
cervical cancer cells. Oncogene 28, 762-772 (2009). 

15. Fu, L. et al. Degradation of p53 by human Alphapapillomavirus E6 proteins 
shows a stronger correlation with phylogeny than oncogenicity. PLoS ONE 5, 
e12816 (2010). 

16. Mesplède, T. et al. p53 degradation activity, expression, and subcellular
localization of E6 proteins from 29 human papillomavirus genotypes. J. Virol. 
86, 94-107 (2012). 

17. Hiller, T., Poppelreuther, S., Stubenrauch, F. & Iftner, T. Comparative analysis of 
19 genital human papillomavirus types with regard to p53 degradation, 
immortalization, phylogeny, and epidemiologic risk classification. Cancer 
Epidemiol. Biomark. Prev. 15, 1262-1267 (2006). 

18. Foster, S. A., Demers, G. W., Etscheid, B. G. & Galloway, D. A. The ability of 
human papillomavirus E6 proteins to target p53 for degradation in vivo 
correlates with their ability to abrogate actinomycin D-induced growth arrest. 
J. Virol. 68, 5698-5705 (1994). 

19. Cooper, B. et al. Requirement of E6AP and the features of human 
papillomavirus E6 necessary to support degradation of p53. Virology 306, 
87-99 (2003). 

20. Liu, Y. et al. Multiple functions of human papillomavirus type 16 E6 contribute
to the immortalization of mammary epithelial cells. J. Virol. 73, 7297-7307 
(1999). 

21. Nakagawa, S. et al. Mutational analysis of human papillomavirus type 16 E6 
protein: transforming function for human cells and degradation of p53 in vitro. 
Virology 212, 535-542 (1995). 

22. Zanier, K. et al. The E6AP binding pocket of the HPV16 E6 oncoprotein 
provides a docking site for a small inhibitory peptide unrelated to 
E6AP, indicating druggability of E6. PLoS ONE 9, e112514 
(2014). 

23. Stutz, C. et al. Intracellular analysis of the interaction between the human 
papillomavirus type 16 E6 oncoprotein and inhibitory peptides. PLoS ONE 10, 
e0132339 (2015). 

24. Hengstermann, A., Linares, L. K., Ciechanover, A., Whitaker, N. J. & Scheffner, M.
Complete switch from Mdm2 to human papillomavirus E6-mediated
degradation of p53 in cervical cancer cells. Proc. Natl Acad. Sci. USA 98,
1218-1223 (2001). 

25. Lilyestrom, W., Klein, M. G., Zhang, R., Joachimiak, A. & Chen, X. S. Crystal 
structure of SV40 large T-antigen bound to p53: interplay between a viral 
oncoprotein and a cellular tumor suppressor. Genes Dev. 20, 2373-2382 
(2006). 

26. Scheffner, M., Takahashi, T., Huibregtse, J. M., Minna, J. D. & Howley, P. M. 
Interaction of the human papillomavirus type 16 E6 oncoprotein with 
wild-type and mutant human p53 proteins. J. Virol. 66, 5100-5105 
(1992). 

27. Cherry, J. J. et al. Structure based identification and characterization of 
flavonoids that disrupt human papillomavirus-16 E6 function. PLoS ONE 8, 
e84506 (2013). 



28. Malecka, K. A. et al. Identification and characterization of small molecule 
human papillomavirus E6 inhibitors. ACS Chem. Biol. 9, 1603-1612 
(2014). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work received institutional support from le Centre
National de la Recherche Scientifique (CNRS), Université de Strasbourg,
l'Institut National de la Santé et de la Recherche Médicale (INSERM) and
Région Alsace. The work was supported by grants from Ligue contre le
Cancer, National Institutes of Health (grant R01CA134737 to S.V.P.), l'Agence
Nationale de la Recherche (ANR-13-JSV8-004-01), Instruct (ESFRI), the French
Infrastructure for Integrated Structural Biology (FRISBI) and Fondation pour La
Recherche Medicale (fellowship to F.X.R.F.). We thank P. Poussin-Courmontagne,
E. Ennifar, V. Olieric and B. Kieffer for advice. The authors declare that the
content is solely their responsibility and does not represent the official views of
the National Institutes of Health.


Author Contributions D.M.Z., J.P., A.M., J.R.R., A.F., A.C.S. and K.Z. performed
experiments; F.X.R., A.P., D.M.Z. and K.Z. performed structure determination;
D.M.Z., F.X.R., J.P., A.M., A.P. and K.Z. analysed the data; D.M.Z., J.P. and
K.Z. prepared figures; K.Z. and G.T. wrote the manuscript together with
comments from all authors; M.M., S.V.P., A.P., G.T. and K.Z. supervised the work.


Author Information Coordinates and structure factors have been deposited
in the Protein Data Bank (PDB) under accession number 4XR8. Reprints and
permissions information is available at www.nature.com/reprints. The authors
declare no competing financial interests. Readers are welcome to comment
on the online version of the paper. Correspondence and requests for materials
should be addressed to K.Z. (zanier@unistra.fr) or G.T. (gilles.trave@unistra.fr).


28 JANUARY 2016 | VOL 529 | NATURE | 545 



METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Preparation of protein samples. DNA constructs. Residues 403-414 of human
E6AP (peptide sequence E1-L2-T3-L4-Q5-E6-L7-L8-G9-E10-E11-R12) were cloned
via a three-alanine linker at the C terminus of a mutant MBP used to promote
crystallization. The point mutations introduced in MBP (D83A, K84A, K240A,
E360A, K363A and D364A) have been previously described to increase the
propensity of MBP to crystallize. The E6 4C/4S mutant of HPV-16 (ref. 12)
(named E6 from hereon) and the core (or DNA binding) domain of human p53
(residues 94-311) were cloned into the pETM-41 vector containing an
N-terminal His6-MBP tag followed by a TEV cleavage site.

Protein expression and purification. MBP-e6ap, His6-MBP-tev-E6 and
His6-MBP-tev-p53core constructs were overexpressed separately in Escherichia coli
BL21(DE3) cells at 15°C for 18 h. All constructs were purified separately by
amylose affinity chromatography in buffer A (50 mM Tris pH 6.8, 400 mM NaCl, and
2 mM DTT). To remove soluble aggregates, all affinity purified samples were
ultracentrifuged at 110,000g in a swing SW41 rotor (Beckman) for 16 h at 4°C.
The p53core and E6 samples were then digested by TEV protease. In the case of
p53core, samples were additionally purified on a heparin column after TEV
digestion. The resulting MBP-e6ap, E6 and p53 samples were then concentrated
and loaded separately onto a Superdex 75 HiLoad 16/60 gel filtration column
(GE Healthcare) equilibrated in buffer A. All purification buffers were
filtered, degassed and saturated with argon.
Crystallization. The E6/e6ap/p53core complex was reconstituted by mixing
MBP-e6ap and E6 and p53core samples in a 1:1:1 stoichiometric ratio in buffer B
(50 mM Tris pH 6.8, 200 mM NaCl, 2 mM DTT, 5 mM maltose) and concentrated
to 18 mg ml⁻¹ before crystallization. Crystallization conditions were screened using
commercially available kits (Qiagen, Hampton Research, Emerald Biosystems)
by the sitting-drop vapour-diffusion method in 96-well MRC 2-drop plates
(SWISSCI), employing a Mosquito robot (TTP Labtech). Initial crystals were
obtained and used as seeds during further optimization steps.

After optimization, crystals (125 μm × 80 μm × 80 μm) grew in sitting drops
made from 400 nl of protein solution at 18 mg ml⁻¹, 370 nl of reservoir solution
containing 7.5% polyethylene glycol (PEG) 20 K, 50 mM MES (pH 6.5) and 30 nl of
seeds. Drops were equilibrated against 50 μl of reservoir solution at 290 K. Crystals
were sequentially transferred through two cryo-solutions of reservoir solution
supplemented with 15% (v/v) and 25% (v/v) of PEG 200, respectively. The crystals were
flash-cooled and stored in liquid nitrogen.
Data collection and processing, and structure determination. X-ray diffraction
data were collected on the X06DA beamline at the Synchrotron Swiss Light
Source (Villigen, Switzerland). Data were acquired from a single cryo-cooled
crystal (100 K) on a Pilatus-2M detector. A total of 200° of data were collected
up to a resolution of 2.25 Å using 0.5° rotation and 0.8 s exposure time with
20% beam attenuation for each image. The data were indexed, processed and
scaled using HKL2000 (ref. 30). The crystals belonged to the monoclinic space
group P2₁ with unit cell parameters a = 78.17 Å, b = 129.37 Å, c = 82.17 Å and
β = 92.4°, with a refined crystal mosaicity of 0.28-0.34°. The asymmetric unit
contained two copies of the E6/MBP-e6ap/p53core heterotrimer, with a
corresponding Matthews coefficient of 2.48 Å³ per dalton and a solvent content
of 50.34%. The structure was solved by two sequential molecular replacements
using Phaser and the structures of MBP, E6/e6ap complex and p53core as
templates (PDB accession numbers 4GIZ and 1TUP, respectively).
Crystallographic refinement involved repeated cycles of conjugate-gradient
energy minimization and temperature-factor refinement, and was performed
using PHENIX followed by iterative model building in Coot. All the disordered
residues were fitted to the built electron density map, except for e6ap residue
Arg12 in one of the two heterotrimers (corresponding to chain B residue Arg383
in the PDB file).
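The relation between the unit cell, the Matthews coefficient and the solvent content quoted above can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes the usual monoclinic volume formula V = a·b·c·sin β, two asymmetric units per P2₁ cell, the common approximation solvent fraction ≈ 1 − 1.23/V_M, and a nominal mass of about 80 kDa per heterotrimer (the figure given for the ternary complex in the Extended Data Fig. 3 legend). All function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume in cubic angstroms for a monoclinic cell: a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, asu_per_cell, mass_per_asu_da):
    """Matthews coefficient V_M (cubic angstroms per dalton)."""
    return cell_volume / (asu_per_cell * mass_per_asu_da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction from V_M (assumed standard approximation)."""
    return 1.0 - protein_factor / v_m


# Unit cell parameters reported in the text.
volume = monoclinic_cell_volume(78.17, 129.37, 82.17, 92.4)

# Assumption: 2 asymmetric units per P2(1) cell, each holding two heterotrimers
# of roughly 80 kDa (nominal value, so the result is only approximate).
v_m_estimate = matthews_coefficient(volume, 2, 2 * 80_000)

print(f"cell volume    = {volume:.0f} cubic angstroms")
print(f"V_M (estimate) = {v_m_estimate:.2f} cubic angstroms per dalton")
print(f"solvent fraction at V_M = 2.48: {solvent_fraction(2.48):.3f}")
```

Because the mass used here is a rounded estimate, the computed V_M is only expected to fall near the reported 2.48 Å³ per dalton, not to reproduce it exactly.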

Data collection and refinement statistics are summarized in Extended Data
Table 2. The quality of the refined models was assessed using MOLPROBITY.
All molecular graphics figures were made using PYMOL. E6 residues at interface
regions (Figs 1b and 2a) were identified by the observation of an increase in solvent
accessibility obtained upon removal of e6ap or p53core from the PDB file of the
ternary complex. Positive 'change in exposure' values (Fig. 2a) indicate E6 residues
in atomic contact with e6ap or p53.
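The interface-detection criterion described above (a residue counts as an interface residue when its solvent accessibility rises once the partner is deleted from the coordinate file) can be sketched as follows. This is an illustrative outline only: it assumes that per-residue accessible surface areas have already been computed by some external tool, and the function name, the zero threshold and the toy numbers are hypothetical rather than taken from the paper.

```python
def interface_residues(sasa_in_complex, sasa_partner_removed, threshold=0.0):
    """Return residues whose solvent accessibility increases when the partner is removed.

    Both arguments map a residue label (e.g. "Glu18") to its accessible
    surface area in square angstroms. The returned dict maps each interface
    residue to its positive 'change in exposure'.
    """
    changes = {}
    for residue, area_bound in sasa_in_complex.items():
        # Residues missing from the second calculation are treated as unchanged.
        area_free = sasa_partner_removed.get(residue, area_bound)
        delta = area_free - area_bound
        if delta > threshold:
            changes[residue] = delta
    return changes


# Toy values for illustration only (not measured data).
bound = {"Glu18": 12.0, "Phe47": 5.0, "Cys80": 40.0}
free = {"Glu18": 55.0, "Phe47": 90.0, "Cys80": 40.0}

exposed = interface_residues(bound, free)
for name, delta in sorted(exposed.items()):
    print(f"{name}: change in exposure = {delta:.1f}")
```

Running the calculation twice, once after deleting e6ap and once after deleting p53core, would give the two sets of values needed to tell the two interfaces apart.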

The refined model and the structure factor amplitudes have been deposited in 
the PDB under accession number 4XR8. 

DNA constructs for triple pull-down and p53 degradation assays. HPV-16 E6
mutations were inserted in the background of the E6 4C/4S construct to prevent
cross-linking of E6 molecules through intermolecular cysteine bridges. We have
previously shown that E6 4C/4S and wild-type HPV-16 E6 have indistinguishable
p53 degradation activities in vitro. The p53 mutations were inserted in the
background of wild-type full-length human p53. All DNA constructs were verified
by sequencing.

Triple pull-downs. DNA constructs encoding full-length HPV-16 E6 and p53,
and a large fragment of E6AP (residues 291-875 of isoform II), were inserted into
the final destination vectors by Gateway cloning (Invitrogen). HPV-16 E6 and
E6AP were inserted into the GPCA pSPICA-N2 and pSPICA-N1 vectors,
respectively. GPCA vectors pSPICA-N1 and pSPICA-N2 (both derived from the pCiNeo
mammalian expression vector) allow expression of test proteins as fusions to the
C terminus of Gluc1 and Gluc2 complementary fragments of the Gaussia princeps
luciferase, respectively. By contrast, p53 constructs were inserted in the
BioEase-DEST vector, which incorporates a 72-amino-acid sequence from Klebsiella
pneumoniae that directs in vivo biotinylation.

Degradation assays. HPV-16 E6 and full-length p53 proteins were cloned in the 
pXJ40 vector. 

Triple pull-downs. HEK-293T cells were grown and maintained in Dulbecco's
modified Eagle's medium (DMEM), supplemented with 10% FCS and 50 μg ml⁻¹ of
gentamycin at 37°C with 5% CO2 and 95% humidity. Cells were seeded in six-well
plates at a concentration of 3 x 10° cells per well. After 24h, cells were transfected 
using JetPEI (Polyplus transfection) with 1 μg of pSPICA-N2 plasmid expressing E6,
1 μg of pSPICA-N1 plasmid expressing E6AP protein and 1 μg of BioEase plasmid
expressing p53. At 24 h after transfection, cells were harvested and lysed by
freeze-thawing in 100 μl of Renilla lysis buffer (Promega, E2820). Cellular lysates were
cleared by centrifugation at 11,000g in a microfuge for 15 min at 4°C. Subsequently,
30 μl of pre-equilibrated Streptavidin Mag Sepharose beads (GE Healthcare) were
incubated with 60 μl of cellular lysate supernatant for 2 h at 4°C, thereby
allowing capture of biotinylated p53. Streptavidin beads were washed three times with
TNE buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% NP40, 1 mM EDTA and
protease inhibitor), resuspended with 25 μl of protein loading buffer and loaded
on a 15% SDS-polyacrylamide gel electrophoresis (SDS-PAGE) gel. E6 and E6AP
proteins were detected by western blotting using rabbit anti-Gluc antibody (NEB,
reference E8023S), whereas p53 was detected using a mouse anti-p53 DO-1
antibody (Life Technologies, reference 13-4000). The immunoreactive bands were
visualized using WesternBright Sirius (Advansta) and an LAS 4000 camera. Bands
were quantified using ImageGauge software (Fujifilm). Error bars, s.d. from three
independent experiments.

Cell lines. The HEK-293T cell line was provided by P. Charneau. These cells were 
authenticated and tested to be mycoplasma free. 

p53 degradation assays. E6 and p53 proteins were in vitro translated in the
presence of [35S]methionine using the TNT T7 coupled rabbit reticulocyte system. The
p53 degradation reactions were performed in 10 μl volumes by incubating 2 μl of
p53 translation product with either 5 or 2.5 μl of E6 translation products at 28°C
for 2 h according to previously described protocols. Reactions were resolved on a
15% SDS-PAGE gel. Gels were exposed to a PhosphorImager screen and scanned
using a Typhoon FLA 9500 imaging system (GE Healthcare). Reactive p53 bands
were quantified using ImageQuant TL software (GE Healthcare).

The p53 degradation activity values reported in Fig. 3 were derived using the
formula $(I_0 - I)/I_0$, where $I$ is the intensity of the p53 double band after incubation
with E6 and $I_0$ the p53 signal in the input lane. The p53 degradation activity values
were normalized to 100% for the reference proteins (E6 4C/4S/WT p53). For p53
mutant proteins (Fig. 3b), besides the $I_0$ input control, the $I_{2h}$ control
(corresponding to p53 at time = 2 h in the absence of E6) was added to check for the intrinsic
stability of p53 mutants. Error bars, s.d. from three independent experiments.
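The degradation-activity calculation described above (input signal minus remaining signal, divided by input signal, then scaled so that the reference pair equals 100%) can be written out as a short sketch. The function names and the band intensities below are hypothetical and serve only to illustrate the arithmetic; they are not data from the study.

```python
def degradation_fraction(input_signal, signal_after_e6):
    """Fraction of p53 degraded: (I0 - I) / I0."""
    if input_signal <= 0:
        raise ValueError("input signal must be positive")
    return (input_signal - signal_after_e6) / input_signal


def normalized_activity(input_signal, signal_after_e6, ref_input, ref_after_e6):
    """Degradation activity as a percentage of the reference pair (E6 4C/4S with WT p53)."""
    reference = degradation_fraction(ref_input, ref_after_e6)
    if reference == 0:
        raise ValueError("reference shows no degradation; cannot normalize")
    return 100.0 * degradation_fraction(input_signal, signal_after_e6) / reference


# Hypothetical band intensities in arbitrary units.
reference_fraction = degradation_fraction(100.0, 25.0)
mutant_activity = normalized_activity(100.0, 62.5, 100.0, 25.0)

print(f"reference fraction degraded: {reference_fraction:.2f}")
print(f"mutant activity: {mutant_activity:.0f}% of reference")
```

The same `degradation_fraction` function could be applied to the 2 h no-E6 control to check the intrinsic stability of a p53 mutant, where a value near zero would indicate a stable protein.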
Extended Data Figure 1 | Constructs of the minimal HPV-16 E6/E6AP/p53
ternary complex. Green: E6AP residues 403-414 (e6ap 1-12) fused to the
C terminus of MBP (pink) via a AAA linker (red); gold: the HPV-16 E6 4C/4S
construct (ref. 12) (named E6 from hereon) comprising four cysteine to
serine substitutions (cyan marks) that suppress aggregation mediated by
disulfide cross-bridging; blue: p53 core domain (p53core).
Extended Data Figure 2 | Assembly of the E6/MBP-e6ap heterodimer.
a, Dynamic light scattering (DLS) analysis of E6/MBP-e6ap samples.
Histograms report the average hydrodynamic radii of the particles,
whereas error bars indicate size polydispersity. Numbers above the
histograms indicate molecular mass estimates assuming a spherical model.
This analysis shows that binding to MBP-e6ap enhances the solubility of
E6, which, in the unbound state, displays a solubility threshold of 50 μM.
However, in the case of E6/MBP-e6ap samples (grey histograms), particle
size increases with rising concentration. By contrast, introduction of the
F47R mutation in E6 (green histograms) stabilizes particle size to values
close to what is expected for a simple heterodimer (~60 kDa). Therefore
we conclude that, despite the increase in solubility, E6/MBP-e6ap still
undergoes weak self-association via the E6N region hosting Phe47
(see ref. 12 for further discussion). b, Gel filtration analysis of E6/
MBP-e6ap samples. The elution volumes for molecular size markers are
reported on top of the figure. The expected elution volumes of a simple
E6/MBP-e6ap heterodimer (1x, 60 kDa) and of a dimer of heterodimers
(2x, 120 kDa) are indicated. Note the relatively small shift in the elution
volumes of the different samples compared with the differences in the
hydrodynamic radii (a). This suggests that oligomers of the E6/MBP-e6ap
heterodimer are rather weak and dissociate on the gel filtration column.
See also Supplementary Methods.
Extended Data Figure 3 | Interaction of p53core with preformed E6/
MBP-e6ap heterodimer. a, Isothermal titration calorimetry experiments
were performed by titrating increasing amounts of p53core into E6/MBP-e6ap
heterodimer samples adjusted to a concentration of 45 μM, which
limits heterodimer oligomerization. Note that the F47R mutation in E6
abolishes binding to p53core. See also Extended Data Table 1. b, Top:
comparison of gel filtration elution profiles of E6/MBP-e6ap/p53core (red
dashed line) versus E6/MBP-e6ap samples (black line). Both samples were
adjusted to a concentration of 250 μM before loading onto the gel filtration
column. The expected elution volumes for p53core (24 kDa), monomeric
E6/MBP-e6ap heterodimer (60 kDa) and E6/MBP-e6ap/p53core ternary
complex (80 kDa) are indicated. Bottom: SDS-PAGE analysis of fractions
comprising the elution peak of the ternary complex. Note the significant
shift in the elution volumes of the main peak in the two chromatograms,
indicating formation of a ternary complex. See also Supplementary
Methods.
Extended Data Figure 4 | Comparison of the structures of the two E6/
MBP-e6ap/p53core heterotrimers (heterotrimers A and B) observed
in the asymmetric unit. Green: e6ap fused to the C terminus of MBP
(pink) via the AAA linker (red); gold: HPV-16 E6; blue: p53core; spheres:
zinc atoms. a, Different orientation of MBP in the two heterotrimers,
which results from the different conformations of the two AAA linkers.
b, Superposition of the E6/e6ap/p53 regions of heterotrimers A (green/
gold/blue) and B (grey). The backbone r.m.s.d. for the E6/e6ap/p53 regions
of the two heterotrimers was calculated by aligning backbone atoms of
residues (1) 12-136 of HPV-16 E6, (2) 371-379 of e6ap and (3) 109-191 of
p53core and found to correspond to 0.9 Å. Regions displaying significant
differences are boxed. These regions are the ill-defined C-terminal region
of the e6ap peptide (see also Extended Data Fig. 6) and the α4 helix of E6,
which is not involved in the E6-p53 interface.
[Panel labels: HPV16 E6/e6ap (heterotrimer A); HPV16 E6/e6ap (heterodimer, 4GIZ); p53 core (heterotrimer A); p53 core (unbound, 2OCJ)]

Extended Data Figure 5 | The structures of the E6/e6ap and p53core
subunits of the ternary complex are superimposable with previously
solved structures of the E6/e6ap heterodimer and p53core. Left:
superposition of the previously solved E6/e6ap heterodimer (grey; ref. 8) onto
the E6/e6ap subunit of the ternary complex (heterotrimer A, gold/green)
determined here. Dashed lines highlight regions of conformational
change, namely the N terminus of HPV-16 E6 and the C terminus of the
e6ap peptide. Right: superposition of previously solved p53core in the
unbound state (grey) onto the p53core subunit of the
ternary complex (heterotrimer A, blue).
Extended Data Figure 6 | Contributions of the e6ap C-terminal region
to ternary complex interface. a, Sequence of wild-type e6ap peptides
used in the study. Green: wild-type e6ap(1-12) corresponding to the
peptide used for crystallization of the ternary complex. Grey: wild-type
e6ap(1-16) containing a four-amino-acid C-terminal extension. b, Electron
density (2Fo - Fc) map (heterotrimer A) contoured at 1σ level for e6ap
(green, left) and for selected E6 (orange) and p53 (blue) interface residues
in the proximity of the C terminus of e6ap (right). Note the lack of
electron density data for e6ap residues Glu10, Glu11 and Arg12. Two
conformations for the side chains Glu10 and Arg12 of e6ap and Arg131 of
E6 are proposed in the model. c, Isothermal titration calorimetry curves
showing the interactions of p53core with pre-formed E6/e6ap complexes
bearing either point mutations within e6ap C terminus (E10A, E11A and
R12A) or the wild-type e6ap(1-16) peptide construct. Note that neither
the e6ap point mutations nor the e6ap C-terminal boundary affect the
interaction between the E6/e6ap heterodimer and p53core. See also
Extended Data Table 1.
Extended Data Figure 7 | Interactions mediated by hrm-HPV-conserved residues shaping the conformation of the N-terminal arm of HPV-16 E6.
Whereas Phe2 contributes to tethering of the N-terminal region to the core of the E6N domain, residues Pro5, Arg8 and Tyr54 are involved in a triple 
stacking interaction. 
Extended Data Figure 8 | Expression levels of HPV-16 E6, E6AP and
p53 proteins in triple pull-down assays. HEK293T cells were transfected with
1 µg of pSPICA-N1 plasmid (expressing E6AP fused to the C terminus
of the G. princeps luciferase Gluc1 fragment), 1 µg pSPICA-N2 plasmid
(expressing the E6 fused to the C terminus of the G. princeps luciferase
Gluc2 fragment) and 1 µg of BioEase plasmid (expressing N-terminally
biotinylated p53). Cell lysates were resolved by SDS-PAGE electrophoresis
and E6, E6AP and p53 fusion proteins detected by western blotting. For gel
source data, see Supplementary Fig. 1.
Panel label: HPV16 E6 mutants.
Extended Data Table 1 | Thermodynamic parameters of p53core binding to preformed E6/e6ap complex

Peptide            N*             c value†   Kd (µM)      ΔH (kcal/mol)   TΔS (kcal/mol)   ΔG (kcal/mol)
wt e6ap(1-12)      0.826±0.016    1.8/2.4    22.1±4.5     -17.4±1.5       -0.926±0.140     -16.5
wt e6ap(1-16)      0.516±0.046    1.8/2.7    20.5±3.9     -19.1±2.7       -1.100±0.299     -18.0
E10A e6ap(1-12)    0.827±0.080    3.1/3.3    14.1±0.6     -17.1±0.5       -0.878±0.051     -16.2
E11A e6ap(1-12)    0.679±0.146    2.6/4.0    14.3±4.6     -13.8±0.1       -0.599±0.012     -13.2
R12A e6ap(1-12)    0.719±0.111    1.0/2.2    34.6±13.6    -13.9±2.7       -0.651±0.250     -13.2

All isothermal titration calorimetry experiments were performed at 25 °C.
*N refers to the molar ratio of p53core-E6/e6ap complex.
†The c value is defined as: c = n[E6/e6ap]/Kd.
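The free-energy column of the table above can be cross-checked against the enthalpy and entropy columns using ΔG = ΔH − TΔS. The following is only a minimal sketch of that check: the names `TABLE_ROWS`, `gibbs_free_energy` and `check_table` are hypothetical, the numbers are copied from the table with the reported uncertainties left out, and the 0.1 kcal/mol tolerance is an assumed allowance for rounding.

```python
# Enthalpy, entropy term and free energy (kcal/mol) as listed in
# Extended Data Table 1; the reported uncertainties are omitted here.
TABLE_ROWS = {
    "wt e6ap(1-12)":   (-17.4, -0.926, -16.5),
    "wt e6ap(1-16)":   (-19.1, -1.100, -18.0),
    "E10A e6ap(1-12)": (-17.1, -0.878, -16.2),
    "E11A e6ap(1-12)": (-13.8, -0.599, -13.2),
    "R12A e6ap(1-12)": (-13.9, -0.651, -13.2),
}


def gibbs_free_energy(delta_h, t_delta_s):
    """Return dG = dH - T*dS; all terms must share the same unit."""
    return delta_h - t_delta_s


def check_table(rows, tolerance=0.1):
    """Map each row name to True if the listed dG matches dH - T*dS.

    The tolerance (kcal/mol) is an assumed allowance for rounding.
    """
    return {
        name: abs(gibbs_free_energy(dh, tds) - dg) <= tolerance
        for name, (dh, tds, dg) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_table(TABLE_ROWS).items():
        print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

With the values as transcribed, every row agrees with ΔG = ΔH − TΔS to within the assumed rounding tolerance.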
Extended Data Table 2 | Data collection and refinement statistics

                                E6/e6ap/p53
Data collection
Space group                     P2₁
Cell dimensions
  a, b, c (Å)                   78.15, 129.41, 82.26
  α, β, γ (°)                   90, 92.4, 90
Resolution (Å)                  50-2.25 (2.33-2.25)
Rsym or Rmerge                  6.0 (68.4)
I/σI                            17.6 (1.8)
Completeness (%)                99.5 (99.6)
Redundancy                      3.2 (3.1)
Refinement
Resolution (Å)                  50-2.25
No. reflections                 TI2o2
Rwork/Rfree (%)                 19.4/24.6 (20.9/33.0)
No. atoms                       12034
  Protein                       11593
  Ligand/ion                    84
  Water                         354
B-factors (Å²)†
  Protein                       50
  MBP-e6ap (chains A, B)        50, 58
  p53 (chains C, D)             43, 45
  E6 (chains F, H)              49, 55
  Water                         47
R.m.s. deviations
  Bond lengths (Å)              0.010
  Bond angles (°)               1 By
Ramachandran
  No. residues‡                 1455
  Favored (%)                   95.94
  Allowed (%)                   3.99
  Outliers (%)                  0.07

*Highest resolution shell is shown in parentheses.
†Values refer to occupancy-weighted average B-factors.
‡The total number of residues in the chain cannot be analysed: phi and psi angles cannot be analysed for terminal residues, non-standard residues or incompletely modelled main chain residues.
Structure of a HOIP/E2~ubiquitin complex reveals
RBR E3 ligase mechanism and regulation

Bernhard C. Lechtenberg, Akhil Rajput, Ruslan Sanishvili, Malgorzata K. Dobaczewska, Carl F. Ware,
Peter D. Mace & Stefan J. Riedl


Ubiquitination is a central process affecting all facets of cellular 
signalling and function. A critical step in ubiquitination is the
transfer of ubiquitin from an E2 ubiquitin-conjugating enzyme 
to a substrate or a growing ubiquitin chain, which is mediated 
by E3 ubiquitin ligases. RING-type E3 ligases typically facilitate 
the transfer of ubiquitin from the E2 directly to the substrate.
The RING-between-RING (RBR) family of RING-type E3 ligases, 
however, breaks this paradigm by forming a covalent intermediate 
with ubiquitin similarly to HECT-type E3 ligases. The RBR
family includes Parkin and HOIP, the central catalytic factor of
the LUBAC (linear ubiquitin chain assembly complex). While
structural insights into the RBR E3 ligases Parkin and HHARI in
their overall auto-inhibited forms are available, no structures
exist of intact fully active RBR E3 ligases or any of their complexes. 
Thus, the RBR mechanism of action has remained largely 
unknown. Here we present the first structure, to our knowledge, 
of the fully active human HOIP RBR in its transfer complex with 
an E2~ubiquitin conjugate, which elucidates the intricate nature 
of RBR E3 ligases. The active HOIP RBR adopts a conformation 
markedly different from that of auto-inhibited RBRs. HOIP 
RBR binds the E2~ubiquitin conjugate in an elongated fashion, 
with the E2 and E3 catalytic centres ideally aligned for ubiquitin 
transfer, which structurally both requires and enables a HECT- 
like mechanism. In addition, three distinct helix-IBR-fold motifs 
inherent to RBRs form ubiquitin-binding regions that engage 
the activated ubiquitin of the E2~ubiquitin conjugate and, 
surprisingly, an additional regulatory ubiquitin molecule. The 
features uncovered reveal critical states of the HOIP RBR E3 ligase 
cycle, and comparison with Parkin and HHARI suggests a general 
mechanism for RBR E3 ligases. 


RBR E3 ligases are characterized by an extended RING domain 
(RING1) followed by an ‘in-between RING’ (IBR) domain and the cat- 
alytic domain, which is structurally an IBR domain but is commonly 
designated RING2 (Extended Data Fig. 1a, b). HOIP, one of the
most studied RBRs, is the key E3 ligase of the linear ubiquitin chain 
assembly complex (LUBAC). It is a prototypical RBR yet contains an 
extended RING2 domain that includes the LDD (linear ubiquitin chain 
determining domain, Extended Data Fig. 1c) and is thus denoted
RING2L. The LDD enables the selective formation of linear ubiqui- 
tin linkages. The HOIP RBR is kept in an auto-inhibited state by the 
HOIP UBA domain, whose sequestration by the LUBAC constituent 
HOIL-1L activates HOIP to trigger, together with SHARPIN, NF-κB
signalling and other cellular processes. To obtain the first insight
into an active RBR in a key catalytic complex, we generated a stable
E2~ubiquitin conjugate (UbcH5B C85K~ubiquitin) (ref. 21) and isolated its
complex with HOIP RBR. The subsequent addition of free ubiquitin 
proved necessary for crystal formation, allowing us to solve the HOIP 
RBR/UbcH5B~ubiquitin transfer complex structure at 3.5 Å resolution
(Fig. 1a; Extended Data Figs 2 and 3). 

The asymmetric unit contains two HOIP RBR molecules interacting 
with two UbcH5B~ubiquitin conjugates and an additional ubiquitin 
or E2~ubiquitin conjugate, arranged in a swapped dimer configu- 
ration (Extended Data Fig. 3a). While this arrangement could have 
functional relevance, analysis of interfaces and biophysical exami- 
nation (Extended Data Fig. 3b-f) indicate a monomeric assembly of 
the HOIP/E2~ubiquitin loading complex (Fig. 1), represented in the 
crystal structure by the RING1-IBR module (residues 699-852) from 
one HOIP molecule and the RING2L (residues 853-1,072) from the 
second HOIP molecule in the asymmetric unit. In this assembly, the 
RING1-IBR module forms an elongated arm-like unit (Fig. 2a) that


Figure 1 | Structure of the HOIP RBR/UbcH5B~ubiquitin transfer complex.
a, Structure with key elements annotated. The RBR RING1 and IBR
together with two RING1 extension helices (hE1, hE2) form an arm-like
unit (magenta; for domain annotations see Extended Data Fig. 1). The
RING1-IBR and the catalytic RING2L (green) are connected by two linker
helices (hL1, hL2) and together engage the E2~ubiquitin conjugate
(UbcH5B~Ubact; orange and cyan, respectively), positioning it for
ubiquitin transfer onto the RBR catalytic cysteine (yellow circle). An
allosteric ubiquitin molecule (Uballo, blue) binds to the RING1-IBR
across from the activated ubiquitin. Zinc ions are shown as grey
spheres; hR denotes the RING1 helix. b, Location of the three
ubiquitin-binding regions (UBRs, dashed lines) in the complex.
Figure 2 | The HOIP RING1-IBR coordinates the UbcH5B~ubiquitin
conjugate in a bipartite manner tailored to a HECT-like mechanism.
a, Coordination of the UbcH5B~ubiquitin conjugate (orange and cyan)
by the RING1-IBR (magenta). HOIP RING2L is indicated schematically.
b, The HOIP RING1 coordinates the E2 in a shifted position compared
to classic RINGs. Overlay of HOIP RING1/UbcH5B (magenta/yellow)
and RNF4 RING/UbcH5A (brown/orange, PDB: 4AP4 (ref. 21)). c, RBR
hE2 and IBR form UBR1 binding the activated ubiquitin. A central salt
bridge system connects hE2 and ubiquitin. d, Comparison of E2~ubiquitin
binding by HOIP RING1 (top) and a classic RING (bottom, RNF4 PDB:


together with the RING2L embraces the E2~ubiquitin conjugate in a 
clamp-like manner (Fig. 1a). This active HOIP RBR conformation is 
markedly different from previous structures of auto-inhibited RBRs 
(Extended Data Fig. 1d) and enables an astounding array of features 
inherent to the active RBR. Most notably, three distinct helix-IBR-fold 
motifs function as essential discrete ubiquitin-binding regions (UBR) 
(Fig. 1b). 

The HOIP RING1/E2 interaction is tailored towards a HECT-like 
mechanism, setting it apart from classic RING E3 ligases. While RING/ 
E2 interactions of both classic RING and RBR E3 ligases utilize similar 
surfaces (Extended Data Fig. 4a), the position of the HOIP RING1
domain relative to the E2 is shifted compared to classic RING/E2 com- 
plexes (Fig. 2b). Therefore, the RBR RING1 and the E2 do not form a 
composite surface to bind the E2-conjugated activated ubiquitin (Ubact,
Extended Data Fig. 4b, c, e), which is key to the mechanism of classic
RING E3 ligases. Instead, two extension helices (hE1, hE2) link the
RBR RING1 to the IBR domain (Figs 1 and 2a), and helix hE2 with
the IBR forms a UBR (UBR1) that engages the activated ubiquitin
(Fig. 2c and Extended Data Fig. 5a, b). UBR1 binds ubiquitin in a
distinctive mode (mode 1) that utilizes a salt bridge system involving
HOIP hE2 residues K783 and E787 and ubiquitin residues K11 and E34,
4AP4 (ref. 21)) highlights the differences in E2~ubiquitin thioester
positioning (red spheres). The directionality of the thioester attacking
residue (active site cysteine of RBR in the HECT-like transfer and lysine
of the substrate in RING-mediated transfer) is indicated. e, f, Quantitative
thioester transfer (e) (mean activity ± s.e.m. (n = 3); one-way ANOVA
followed by Tukey's post-test; **P < 0.01; ***P < 0.001; Supplementary
Fig. 1) and linear polyubiquitination (f) assays of HOIP RBR wild type
(WT) and interface mutants, and HOIP catalytic domain (RING2L).
g, Thioester transfer and linear ubiquitin assays for UbcH5B wild type
and RING1 interaction mutant F62A (Coomassie-stained bands in red).


with further support from the HOIP IBR (Fig. 2c and Extended Data 
Fig. 5b). Thus, in HOIP the entire RING1-IBR arm mediates bipartite
binding of the E2~ubiquitin conjugate, with RING1 binding the E2 and
the hE2-IBR module binding activated ubiquitin (Fig. 2a). Sequence
and structural comparisons with Parkin and HHARI suggest conserva- 
tion of this mechanism among RBR E3 ligases (Extended Data Figs 4c 
and 5c). Importantly, the bipartite binding mode results in an elon- 
gated conformation of the E2~ubiquitin conjugate with its thioester 
linkage not suited for direct attack by the amine function of a substrate 
(Fig. 2d and Extended Data Fig. 4e-f). The consequence is an entirely 
different catalytic arrangement compared to classic RING-supported 
catalysis, as emphasized by the observed lack of effect of mutations 
in UbcH5B L104 and S108, two residues crucial for classic RING/E2 
catalysis (Extended Data Fig. 4d). Instead, the E2~ubiquitin
thioester is ideally positioned for transfer of the activated ubiquitin 
onto HOIP RING2L, thus both enabling and requiring a HECT-like 
mechanism. The importance of each interaction site is demonstrated 
in thioester transfer and polyubiquitination assays, where mutation 
of key RING1 and UbcH5B residues drastically impairs RBR activity
(Fig. 2e-g). Single mutations in the hE2 salt bridge moderately diminish
activity in thioester assays but dramatically impair polyubiquitination 
Figure 3 | Mechanism of E2~ubiquitin/HOIP RBR ubiquitin transfer.
a, UbcH5B~ubiquitin conjugate bound to HOIP RING2L (RING2,
light green, and the linear ubiquitin chain determining domain (LDD)
extension, dark green). UBR1 (schematic of helix-IBR, magenta)
cooperates with UBR2 (comprising hL2 and IBR fold of RING2L) to bind the
activated ubiquitin. UbcH5B interacts with RING2L in a region designated
for the acceptor ubiquitin (displayed in b). b, Positioning of E2~Ubact or


activity (Fig. 2e, f), indicating a cumulative effect due to a potential role 
of UBR1 in coordinating Ubact in steps subsequent to its initial transfer
to HOIP. However, removal of the salt bridge in HOIP/ubiquitin
double mutants and mutation of the HOIP IBR/Ubact interface cause
the expected drastic reduction in thioester transfer activity (Extended 
Data Fig. 5d-f). 

The other portion of the RBR/E2~ubiquitin embrace is centred 
around the catalytic HOIP RING2L (Figs 1a and 3a). Here a helix- 
IBR-fold motif consisting of helix hL2 from the IBR-RING2L linker and
RING2L forms a second UBR (UBR2) binding the activated ubiquitin
(Fig. 3a and Extended Data Fig. 6a-c). UBR2, which is conserved in
Parkin and HHARI (Extended Data Fig. 6d), uses a hydrophobic pattern
in helix hL2 and RING2 to interact with the canonical (I44) and a
second hydrophobic patch of ubiquitin (Extended Data Fig. 6b-e).
These interactions support the engagement of the ubiquitin R72/R74 
di-Arg motif by polar residues, which is the key characteristic of UBR 
binding mode 2. This ultimately places the ubiquitin C terminus onto 
RING2 (Extended Data Fig. 6b, d) and thus the ubiquitin/E2-thioester 
linkage onto the RBR active site. A previous structure (ref. 14) of the isolated
HOIP RING2L with two ubiquitin molecules bound in a linear non- 
covalent arrangement mimics the final HOIP RING2L donor/acceptor 
ubiquitin transfer complex (Fig. 3b and Extended Data Fig. 6f). 
Remarkably, in this structure the donor ubiquitin adopts a position 
identical to the activated ubiquitin in HOIP UBR2 despite lacking the 
hL2 interaction (Extended Data Fig. 6f). This indicates that the binding
mode of the activated ubiquitin observed in the HOIP/E2~ubiquitin 
complex persists from the E2~ubiquitin/E3 HOIP transfer complex 
to the HOIP~ubiquitin/acceptor ubiquitin transfer complex. The E2 
portion of the E2~ubiquitin conjugate instead interacts with a region 
of RING2L that overlaps with that observed for the acceptor ubiquitin 
in the second transfer reaction (Fig. 3b and Extended Data Fig. 6f). 
Thus, the binding of E2~ubiquitin and of acceptor ubiquitin/substrate 
are mutually exclusive, requiring formation of a covalent HECT-like 
RBR~ubiquitin intermediate in the RBR E3 ligase cycle. 

Importantly, the HOIP RBR/E2~ubiquitin complex structure lacks 
the spatial gap between the E2 and E3 catalytic centres that is fre- 
quently observed in HECT/E2 complex structures and that was also 
predicted for RBR/E2~ubiquitin transfer complexes. Thus,
except for the ~3.5 Å spacer due to the C85 to lysine substitution in
the E2~ubiquitin conjugate, the HOIP RBR/E2~ubiquitin structure 
accurately depicts the immediate transfer complex. Here the catalytic 
centres of HOIP RING2L and E2 come in close proximity via two 
contact conduits involving all three proteins (Fig. 3c and Extended 
Data Fig. 7a). The first conduit consists of ubiquitin R72, which 
interacts with D983 and Q974 in the β5/6-hairpin of HOIP RING2L.
Additionally, E976 in this hairpin mediates a salt bridge with ubiquitin 
donor/acceptor ubiquitin (Ubdon or Ubacc, respectively) onto RING2L.
c, Ternary HOIP RBR/E2~ubiquitin catalytic transfer complex. Two main
contact conduits (red dashes) position the RBR catalytic cysteine (C885)
near the E2 C85-ubiquitin thioester linkage. d, HOIP~ubiquitin thioester
formation assay with wild-type or E976A RBR supports the HOIP/
UbcH5B/Ub link in conduit 1 (mean activity ± s.e.m. (n = 3); two-tailed
unpaired Student's t-test; ***P < 0.001; Supplementary Fig. 1).


R74 and UbcH5B R90, thus facilitating interactions among all three 
proteins. The second conduit consists of catalytic residues of UbcH5B 
(N77, D117) and HOIP (H887, Q896). These residues
appear permissive to close proximity between the reaction centres, 
yet not crucial for transesterification because, for example, a H887A 
mutation does not affect the thioester transfer reaction (Extended
Data Fig. 7b). Surprisingly, mutation to alanine of UbcH5B D117, a 
critical residue for classic RING-supported catalysis, enhances
transesterification (Extended Data Fig. 7b), further underlining the 
vastly different catalytic mechanism of RBR E3 ligases. This finding 
also points to a trade-off in the E2 active site to support both classic 
Figure 4 | An allosteric ubiquitin interacts with UBR3 in the RING1-IBR
arm and is crucial for HOIP activity. a, An allosteric ubiquitin
(Uballo, blue) binds to UBR3 across the activated ubiquitin. Left, overview
depicting UBR3/Uballo. Top right, UBR3 hE2-IBR/Uballo interaction.
Bottom right, magnified view of the HOIP ubiquitin di-Arg binding motif
(E809) anchoring a parallel ubiquitin/IBR β-sheet. b, Linear di-ubiquitin
increases HOIP RBR activity. Thioester transfer assays of wild-type HOIP
and UBR3 mutants (mean activity ± s.e.m. (n = 3); one-way ANOVA
followed by Tukey's post hoc test; **P < 0.01; ***P < 0.001; NS, not
significant; Supplementary Fig. 1). c, Polyubiquitination assays showing
release of HOIP UBA-RBR auto-inhibition by HOIL-1L or linear
di-ubiquitin. d, Effect of wild-type HOIP or UBR3 mutants in NF-κB
reporter assays using HEK293T cells expressing full-length HOIP with or
without HOIL-1L (mean activity ± s.e.m. of three biological replicates each
with three technical replicates; one-way ANOVA followed by Tukey's post
hoc test; ***P < 0.001; Extended Data Fig. 9f).
RING- and HECT-type RBR E3 ligases. Notably, UbcH7, which is
specialized for HECT-like E3 catalysis, features a histidine instead
of D117 (Extended Data Fig. 7a). Mutational analysis demonstrates 
a crucial role for conduit 1 and also indicates that the close proximity 
between the ubiquitin thioester (and thus C85 of UbcH5B) and HOIP 
catalytic cysteine C885 is the driving factor for E2/RBR E3 ubiquitin 
transfer (Fig. 3c, d and Extended Data Fig. 7). Analysis of Parkin and 
HHARI shows conservation of the conduits (Extended Data Fig. 7a). 
However, while Parkin and HHARI lack the β5/6-hairpin indigenous
to HOIP RING2L, they instead possess a pair of conserved polar 
residues in the RING2 active site loop that are capable of binding the 
di-Arg motif in conduit 1. 

Surprisingly, our structure reveals that an additional allosteric 
ubiquitin molecule (Uballo) interacts with a third HOIP UBR. UBR3
is located in the RING1-IBR arm immediately across UBR1 and
Ubact (Figs 1b, 4a and Extended Data Fig. 8a, b). Uballo uses a binding
mode similar to that of Ubact with UBR2 (mode 2), characterized by
hydrophobic interactions and a di-arginine binding clamp (Extended
Data Fig. 8a). Uballo interacts with helix hE2 of the extended RING1
and with the IBR, and makes additional interactions with helix hE1
(Fig. 4a and Extended Data Fig. 8c). Through this binding, Uballo
induces a 'straight' conformation of helix hE2, locking RING1 and IBR
in their relative position, forming UBR1 to accommodate the activated 
ubiquitin (Fig. 4a and Extended Data Fig. 8b-d). Notably, UBR3 in 
the HOIP RBR/UbcH5B~ubiquitin complex binds linear di-ubiquitin 
(Kd = 7 µM) better than mono-ubiquitin (Kd > 50 µM) (Extended Data
Fig. 9a). Pre-incubation of HOIP RBR with linear di-ubiquitin leads 
to improved binding of UbcH5B~ubiquitin (Extended Data Fig. 9b), 
emphasizing the allosteric function of UBR3. Accordingly, HOIP 
UBR3 I807A and E809A mutants show moderately decreased activity 
in thioester transfer assays but more pronounced effects in polyubiquit- 
ination assays, where linear di-ubiquitin/polyubiquitin are intrinsically 
produced (Fig. 4b and Extended Data Fig. 9c). Importantly, N- and 
C-terminally capped linear di-ubiquitin increases HOIP RBR thioester 
transfer activity in a dose-dependent manner, but cannot activate HOIP 
UBR3 mutants (Fig. 4b and Extended Data Fig. 9d). Moreover, the linear 
di-ubiquitin I44A mutant also fails to activate HOIP RBR (Extended
Data Fig. 9d). 
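
To illustrate what the reported affinities imply (reading them as Kd = 7 µM for linear di-ubiquitin and Kd > 50 µM for mono-ubiquitin), the sketch below evaluates fractional occupancy under a simple one-site binding model, θ = [L]/(Kd + [L]). The one-site model, the function name `fraction_bound` and the 10 µM free-ligand concentration are assumptions chosen for illustration; they are not taken from the study.

```python
def fraction_bound(ligand_um, kd_um):
    """Fractional occupancy of a single binding site.

    Assumes a one-site model with free ligand concentration
    `ligand_um` and dissociation constant `kd_um`, both in µM.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    ligand = 10.0  # µM, hypothetical concentration for illustration
    di_ub = fraction_bound(ligand, 7.0)     # Kd as reported for di-ubiquitin
    mono_ub = fraction_bound(ligand, 50.0)  # lower bound on Kd for mono-ubiquitin
    print(f"linear di-ubiquitin: {di_ub:.2f}")
    print(f"mono-ubiquitin: at most {mono_ub:.2f}")
```

Because the mono-ubiquitin value is only a lower bound on Kd, the occupancy computed for it is an upper bound under this model.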

Excitingly, the interaction of Uballo with UBR3 is structurally similar
to that recently reported for phospho-ubiquitin in a tethered complex
with ΔUBL Pediculus humanus Parkin (Extended Data Fig. 8c-e).
Binding of phospho-ubiquitin leads to a straight conformation of
Parkin helix hE2 and an accompanying reorientation of RING1 and IBR,
indicating a general role of UBR3 and ubiquitin in allosteric regulation 
of RBR proteins. Functionally, binding of phospho-ubiquitin activates 
Parkin by counteracting the auto-inhibitory function of the Parkin UBL 
domain (Extended Data Fig. 8c-e). In HOIP, the UBA domain exerts
intramolecular auto-inhibition. While the structure of auto-inhibited
HOIP is unknown, the structure of auto-inhibited HHARI shows
binding of its UBA domain to a region analogous to UBR3 (Extended
Data Fig. 8f-h). To determine if linear di-ubiquitin can overcome
HOIP auto-inhibition, we examined its effect on HOIP UBA-RBR. 
As expected, HOIP UBA-RBR alone exhibits low E3 activity but is 
activated by HOIL-1L (Fig. 4c). Notably, linear di-ubiquitin can also 
remove HOIP UBA auto-inhibition and at high concentrations allows 
the processive formation of polyubiquitin chains by HOIP UBA-RBR 
(Fig. 4c). Importantly, in HEK293T cells expressing full-length HOIP, 
the HOIP UBR3 I807A and E809A mutants fail to activate NF-κB
regardless of HOIL-1L expression, demonstrating an essential physio- 
logical role of UBR3 (Fig. 4d). Thus, UBR3 probably serves as a critical 
sensor of ubiquitin chains that regulates LUBAC function. Whether this 
role is tailored to linear ubiquitin chains or ubiquitin chains in general 
(Extended Data Fig. 9b, e) needs further investigation in the context of 
other LUBAC constituents and binding partners. 

The features revealed by the HOIP RBR/E2~Ubact/Uballo complex
structure provide the missing links in our understanding of these
enigmatic multidomain E3 ligases and yield a mechanistic model
for the RBR E3 ubiquitin ligase cycle, as summarized in Extended Data 
Fig. 10. Furthermore, the conservation of key mechanistic features in 
HOIP, HHARI, Parkin and other RBRs (Supplementary Data 2) under- 
lines the general nature of the catalytic RBR cycle revealed in this study. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Constructs. Human HOIP and HOIL-1L cDNA were purchased from Open 
Biosystems (cloneIDs 4653017 and 3877587, respectively). HOIP RBR (residues 
696-1,072), HOIP RING2L (residues 853-1,072) and full-length HOIL-1L were 
cloned into the pET-NKI-6xHis-3C-LIC vector coding for an N-terminal 6×His
tag with a 3C protease cleavage site. HOIP UBA-RBR (residues 475-1,072) was
cloned into a pET-NKI-6xHis-eGFP-3C-LIC vector that codes for a 3C-cleavable
His-tagged enhanced green fluorescent protein (eGFP) followed by the HOIP
sequence. Human UbcH5B and Cdc34 DNA were a gift from M. Petroski. Coding
sequences for UbcH13 and Uev1a were extracted out of a human cDNA library
(Agilent Megaman). For crystallization, UbcH5B (residues 2-147) with the
mutations S22R (to prevent backside ubiquitin binding) and C85K (to enable
covalent ubiquitin linkage (ref. 21)) was cloned into the pET-NKI-6xHis-3C-LIC vector.
UbcH5B without S22R and C85K mutations (used for enzymatic assays), Cdc34,
UbcH13 and Uev1a were cloned into the same vector. Untagged mono-ubiquitin
with native N and C termini, used for crystallization and linear ubiquitination
assays, was cloned into the pET29 vector (Novagen) using NdeI/XhoI restriction
sites. N-terminally blocked mono-ubiquitin used for thioester assays was cloned
in the pET-NKI-6xHis-3C-LIC vector. Untagged linear di-ubiquitin was cloned
with overlap extension PCR and ligated into the pET29 vector (Novagen) using
NdeI/XhoI restriction sites. N- and C-terminally blocked di-ubiquitin with an
N-terminal His tag and a C-terminal Ala-Ser sequence was cloned into the
pET-NKI-6xHis-3C-LIC vector. Human ubiquitin-activating enzyme E1 (Ube1) was
cloned into a pET28 vector resulting in an N-terminal His tag. For NF-κB assays
full-length HOIP with an N-terminal Flag tag and HOIL-1L with an N-terminal
myc tag were cloned into pcDNA3.1(+) (Invitrogen) using EcoRI/NotI restriction
sites. Mutations in UbcH5B, ubiquitin and HOIP were introduced using standard 
site-directed mutagenesis techniques. 

Protein expression and purification. All proteins were expressed in BL21(DE3) 
E. coli after induction with 0.5 mM IPTG overnight at 20 °C. For expression of
HOIP and HOIL-1L constructs, 0.5 mM ZnCl2 was added to the cultures before
induction. Bacteria were harvested by centrifugation, lysed by addition of lysozyme 
and sonication in the presence of protease inhibitors (PMSF and leupeptin) and 
DNase. Lysates were cleared by centrifugation and His-tagged proteins were 
initially purified using Ni-NTA agarose (Qiagen). For HOIP RBR used for
crystallization, and for UbcH5B, Cdc34, UbcH13, Uev1a, wild-type ubiquitin used
to generate K48-linked di-ubiquitin, and HOIL-1L, His tags were removed by addition of
3C protease overnight at 4 °C. HOIP RBR and HOIL-1L were further purified
using Superdex 200 10/300 GL or HiLoad 16/600 Superdex 200 pg size-exclusion 
chromatography columns (GE Healthcare) equilibrated in protein buffer (10 mM 
HEPES pH 7.9, 100 mM NaCl). UbcH5B used for biochemical assays was fur- 
ther purified on a Superdex 75 10/300 GL size-exclusion chromatography column 
(GE Healthcare) equilibrated in protein buffer. HOIP mutants for activity assays, 
and Cdc34, UbcH13 and Uev1a were desalted into protein buffer directly after
Ni-NTA purification using PD MidiTrap G-25 desalting columns (GE Healthcare).
Ube1 for biochemical assays was further purified using ion-exchange chromatography
(Source Q) in 10 mM HEPES pH 7.9, 10 mM NaCl and eluted with a gradient
from 10-500 mM NaCl. N-terminally His-tagged (di-)ubiquitin was purified using 
Ni-NTA as described above followed by size-exclusion chromatography using a 
Superdex 75 10/300 GL column (GE Healthcare) equilibrated in protein buffer or 
buffer exchange into protein buffer using PD MidiTrap G-25 desalting columns. 
To purify untagged mono- or di-ubiquitin, 0.5 mM EDTA and 100 mM sodium 
acetate pH 4.5 were added to the bacterial lysates and lysates were cleared by cen- 
trifugation, diluted sevenfold with 50 mM sodium acetate pH 4.5 and applied to 
a Source S 10/100 ion exchange column (GE Healthcare) equilibrated in 50 mM
sodium acetate pH 4.5. Ubiquitin was eluted with a 0-500 mM NaCl gradient and 
further purified by size-exclusion chromatography on a Superdex 75 10/300 GL 
column (GE Healthcare) equilibrated in protein buffer. His-eGFP-HOIP was puri- 
fied using size-exclusion chromatography as described for HOIP RBR, followed 
by 3C cleavage and removal of His-eGFP via a second round of size-exclusion 
chromatography. All proteins were generally flash frozen in liquid nitrogen in 
small aliquots and stored at −80 °C.

UbcH5B~ubiquitin linkage. UbcH5B~ubiquitin linkage was performed based 
on published methods. Briefly, Ube1, UbcH5B(S22R/C85K) and ubiquitin were
mixed and buffer exchanged into 50 mM Tris pH 10, 150 mM NaCl using PD-10
desalting columns (GE Healthcare). 10 mM MgCl2, 5 mM ATP and 1 mM TCEP
were added and the protein solution was incubated at 37 °C for 16 h. The
completeness of the reaction was monitored using SDS-PAGE and covalently linked
UbcH5B~ubiquitin was purified from unreacted proteins and Ube1 using a
Superdex 75 10/300 GL size-exclusion chromatography column (GE Healthcare)
equilibrated in protein buffer.

HOIP RBR/UbcH5B~ubiquitin complex formation. HOIP RBR was mixed 
with a 1.3-fold molar excess of UbcH5B~ubiquitin and applied to a Superdex 200 
10/300 GL size-exclusion chromatography column equilibrated in protein buffer. 
Complex formation and purity were confirmed using SDS-PAGE, and complex-containing
fractions were pooled and concentrated to ~12 mg ml⁻¹ for crystallization.
HOIP/UbcH5B~ubiquitin/ubiquitin crystallization. Crystallization was 
performed using the vapour diffusion technique in sitting drop MRC 96-well 
plates (Molecular Dimensions). Initial crystals were obtained mixing HOIP/ 
UbcH5B~ubiquitin complex solution with an equimolar amount of free ubiquitin
in the Morpheus Screen (Molecular Dimensions). Subsequently, 2 µl of
the protein complex were mixed with 0.6 µl reservoir solution (0.1 M Morpheus
Buffer 3 pH 8.5 (Tris/Bicine), 0.12 M Morpheus Alcohols Mix (0.02 M each of 
1,6-hexanediol; 1-butanol; 1,2-propanediol (racemic); 2-propanol; 1,4-butanediol; 
1,3-propanediol), 30% Morpheus P550MME_P20K mix (20% PEG550MME, 
10% PEG20K) and 8% glycerol) in MRC 48-well plates (Molecular Dimensions). 
Crystals appeared after about one week at 12°C and were cryo-cooled, and eval- 
uated on a rotating anode X-ray generator (Rigaku FR-E superbright). Seeding 
and dehydration of the crystals was performed to improve crystal diffraction. For 
successful dehydration, reservoir was slowly added to the protein drop (3 × 0.5 µl
within ~2 h) and subsequently equilibrated overnight at 12 °C against a reservoir
solution with increased P550MME_P20K concentration by adding 11 µl 60%
Morpheus P550MME_P20K stock solution to 50 µl reservoir solution. The new
reservoir solution was then slowly added to the protein drop (3 × 0.5 µl, followed
by 2 × 1 µl with removal of 1 µl each in the last steps). After further overnight
equilibration, crystals were harvested from the drop and directly cryo-cooled
in a cryogenic nitrogen stream at 100 K. Crystals diffracted in-house to 4-6 Å.
Complete diffraction data were measured at 100 K at beamline 23ID-D of the 
General Medical Sciences and Cancer Institutes Structural Biology Facility at the 
Advanced Photon Source (GM/CA @ APS), Argonne National Laboratory. Despite 
their size (common dimensions of ~200 × 140 × 100 µm³) crystals exhibited
substantial inhomogeneity resulting in split and smeared diffraction spots. Using raster
scans, a suitable region for data collection could be identified at the edge of the
crystal. Using a small (20 µm diameter) beam, split spots could be separated to
allow reliable indexing and integration. Utilization of a small beam necessitated
higher flux to retain reliable diffraction. To mitigate the radiation damage, the
total dose was distributed over a 100-µm stretch of the crystal by using the 'helical'
mode of 'vector' data collection as implemented in JBluIce. Data were measured
at 1.282 Å wavelength with a Pilatus3 6M pixel array detector with a 1-mm-thick
sensor (Dectris). 
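
The precipitant concentration reached in the dehydration step can be estimated by volume-weighted mixing. The sketch below is only an illustration: it reads the volumes as 11 µl of 60% stock added to 50 µl of reservoir containing 30% P550MME_P20K mix, and it assumes ideal additive volumes and that both percentages are on the same scale. The function name is hypothetical and not part of the published protocol.

```python
def mixed_concentration(vol_a, conc_a, vol_b, conc_b):
    """Return the volume-weighted concentration after combining two solutions.

    Volumes must share one unit (here µl) and concentrations one scale (here %).
    Assumes ideal mixing, i.e. volumes are additive.
    """
    total_volume = vol_a + vol_b
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (vol_a * conc_a + vol_b * conc_b) / total_volume


# Assumed reading of the protocol: 11 µl of 60% stock into 50 µl of 30% reservoir.
final_percent = mixed_concentration(11.0, 60.0, 50.0, 30.0)
print(f"Estimated P550MME_P20K concentration: {final_percent:.1f}%")
```

Under these assumptions the reservoir goes from 30% to roughly 35.4% precipitant mix, a modest increase that fits the gradual dehydration described.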

Data processing and structure solution. Data were collected from a single crystal 
and indexed, integrated and scaled in XDS/XSCALE. Data were further processed
using AIMLESS from the CCP4 suite with a resolution cut-off of 3.48 Å, resulting
in an ⟨I/σI⟩ = 1.6 and CC1/2 = 0.648 in the highest resolution shell. Phasing was
carried out in Phaser using an MR-SAD protocol as implemented in PHENIX.
For this, independent molecular replacement searches were initially performed 
for the RING2L domain of HOIP (from PDB: 4LJP (ref. 14)), UbcH5B (from PDB: 
3A33 (ref. 39)), and ubiquitin (from PDB: 4LJP (ref. 14)) with the four C-terminal 
residues deleted. Various ambiguous solutions were identified that could not be 
separated, and Zn²⁺ sites could not be identified using MR-SAD due to incompleteness
of resultant models. However, manual inspection revealed that some MR
solutions contained ubiquitin oriented near identically to the symmetry-related 
donor ubiquitin observed in the HOIP RING2L/ubiquitin-ubiquitin transfer com- 
plex (PDB: 4LJP (ref. 14)). Based on this observation, a trimmed search model was 
created that contained a complex of the core of HOIP RING2L (with residues 
906-924 and 949-999 removed) and C-terminally truncated ubiquitin. An MR 
search using this model found a single solution for two copies of the complex. After 
successful iterative searches for two UbcH5B molecules and two further ubiquitin 
molecules, MR-SAD using Phaser identified 15 distinct Zn²⁺ sites including the
known Zn²⁺ sites in the RING2L domain of HOIP. Further molecular replacement
in Phaser using a single unit of the initial HOIP RING2L/UbcH5B~ubiquitin 
complex (without the additional second ubiquitin), and the NMR structure of 
HOIP IBR (zinc atoms removed, deposited in Protein Data Bank under PDB
accession number 2CT7, unpublished) correctly placed a single HOIP IBR domain,
which was then manually copied to the other NCS-related HOIP in the asymmetric
unit. For molecular replacement of the HOIP RING1, Sculptor was used to
generate a search model based on the structure of the RING1 domain of HHARI 
(PDB: 4KBL (ref. 11)). However, Phaser was not able to correctly place this domain, 
probably owing to the low sequence conservation of only 27% identity. However, 
since mutational analysis of HOIP suggested that the RING/E2 interaction is preserved
between RING-type E3 ligases and RBR-type E3 ligases, we overlaid the
E2 of the published RNF4-RING/UbcH5A~ubiquitin structure (PDB: 4AP4 (ref.
21)) with the E2 in our structure and then used this overlay to add the RING1
model generated by Sculptor. This overlay placed the HOIP RING1 Zn²⁺-coordinating
residues near the last remaining free Zn²⁺ ions found earlier by
Phaser MR-SAD, indicating correct placement of the RING1 domain. In the final
round of molecular replacement, the two additional ubiquitin (Ub_allo) molecules
were reinstated at the RING1-IBR interface. At this stage, Refmac was used for
refinement using settings optimized for low-resolution refinement including
‘jelly body refinement’ and Babinet scaling. ProSMART was used to generate
external restraints against high-resolution structures (PDB: 4LJO (ref. 14) for HOIP 
RING2L and ubiquitin, and PDB: 2ESK (ref. 45) for UbcH5B). After this, clear 
extra electron density became visible for the unmodelled helical linker regions of 
the RINGI-IBR and IBR-RING2L transitions and for other regions omitted in the 
initial search models. Further model building and refinement were performed
manually in Coot and Refmac. During refinement additional clear positive difference
map electron density became visible and Phaser was used to place one
additional UbcH5B molecule (UbcH5B_add) into this density. TLS restraints were
generated using the TLSMD server and NCS restraints were used throughout
refinement. One overall B-factor was refined in Refmac. In later rounds of refinement
the PDB_REDO server was used for refinement optimization and
MolProbity was used for structure validation. Data processing and refinement
statistics are summarized in Extended Data Fig. 2b. Ramachandran statistics were 
calculated using MolProbity and 94.8% of all residues are in favoured regions, 4.9% 
in allowed regions and 0.3% are outliers. The final structure has a MolProbity score 
of 1.75 (100th percentile). In the final structure the two HOIP RBR molecules (see 
also Extended Data Fig. 3) are defined by electron density from residues 699 to 
707, 711 to 948, 969 to 991, and 996 to 1,011 (chain A) and 699 to 754, 760 to 957,
967 to 1,015, 1,019 to 1,035 and 1,054 to 1,066 (chain B). The catalytic
UbcH5B~ubiquitin conjugates are defined from UbcH5B residues 3 to 147 and
ubiquitin residues 1 to 76 (chains C and E), and UbcH5B residues 2 to 147 and
ubiquitin residues 1 to 76 (chains D and F). The allosteric ubiquitin chains (chains
G and H) are defined from residues 1 to 76 and the additional UbcH5B (chain I)
is defined from residues 2 to 146. PHENIX was used to calculate simulated annealing
(SA) composite omit maps and feature enhanced maps (FEM). All molecular
figures were prepared in PyMOL (Schrödinger, LLC).

K48-linked and K63-linked ubiquitin chain formation. K48-linked and K63- 
linked ubiquitin chains were formed through a linkage-specific enzymatic reaction 
using Cdc34 and UbcH13/Uev1a E2 ubiquitin-conjugating enzymes, respectively,
as described in the literature. Ubiquitin chains were separated using ion-exchange
chromatography as described above for purification of mono-ubiquitin. Purified 
K48-linked di-ubiquitin was directly desalted into protein buffer using PD-10 
desalting columns, whereas K63-linked di-ubiquitin was further purified on a 
Superdex 75 10/300 GL size-exclusion chromatography column equilibrated in 
protein buffer. Native ubiquitin without additional residues was used to generate 
di-ubiquitin chains for ITC experiments, whereas N-terminally blocked ubiquitin 
was used to form K48-linked di-ubiquitin for testing allosteric activation of HOIP 
RBR. 

Linear polyubiquitination assay. Linear ubiquitin formation assays were performed
in 50 mM HEPES pH 7.9, 100 mM NaCl, 10 mM MgCl₂ and 0.6 mM DTT
using 200 nM E1, 1 µM UbcH5B, 1 µM HOIP RBR or HOIP RING2L and 40 µM
untagged ubiquitin. Reactions were started by addition of 10 mM ATP and were
incubated at 30°C for 2 h. Samples were taken at the indicated time points and
treated with 50 mM sodium acetate pH 4.5 as described previously, mixed with
SDS sample buffer and analysed by SDS-PAGE using 12% Bolt Bis-Tris gels (Life
Technologies). Proteins were visualized with Coomassie Brilliant blue dye. To test
the activating effect of linear di-ubiquitin on auto-inhibited HOIP UBA-RBR,
5 µM HOIP UBA-RBR was pre-incubated with N- and C-terminally blocked linear
di-ubiquitin or HOIL-1L at the indicated concentrations before addition of the 
remaining assay components. Samples were taken after 60 min and subsequently 
treated as described above. 

UbcH5B~ubiquitin to HOIP RBR ubiquitin transfer assay. To monitor
HOIP~ubiquitin thioester ubiquitin transfer from UbcH5B to HOIP, Ube1
(100 nM), UbcH5B (4 µM) and N-terminally blocked ubiquitin (32 µM) were
mixed in 50 mM HEPES pH 7.9, 100 mM NaCl, 10 mM MgCl₂ and 5 mM ATP and
incubated at 25°C for 5 min, after which 2 µM HOIP RBR was added. Samples were taken
10 s after HOIP addition, quenched by addition of pre-heated SDS protein-loading
buffer without DTT, and run on a 12% SDS-PAGE gel (Life Technologies). The
10-s time point used was empirically determined with a time-course experiment
(Extended Data Fig. 9g). Gels were stained with Coomassie Brilliant blue dye and
scanned on a LI-COR Odyssey scanner using the 700 nm (red) channel. For the
thioester transfer assay shown in Fig. 3d, 200 nM Ube1, 2 µM UbcH5B, 8 µM HOIP
RBR, 8 µM N-terminally blocked ubiquitin and 10 mM ATP were used and samples
taken after 30 s. Furthermore, proteins were transferred to a PVDF membrane
and ubiquitin was visualized on a LI-COR Odyssey scanner at 800 nm using an 
anti-ubiquitin antibody (P4D1, Santa Cruz, 1:200 dilution in TBST (50 mM Tris 
pH 7.4, 150mM NaCl, 0.05% Tween-20)) followed by an IRDye 800CW second- 
ary antibody (LI-COR, 1:10,000 dilution in TBST). All quantitative experiments 
shown in graphs were performed in triplicates and band intensities were quanti- 
fied using the ImageStudio software (LI-COR). HOIP thioester transfer activity 
was calculated as the fraction of HOIP~ubiquitin to total HOIP for each mutant 
and normalized against thioester transfer activity of wild-type HOIP. Data were 
analysed in GraphPad Prism using two-tailed unpaired Student's t-test or one-way 
ANOVA followed by Tukey’s post hoc test. 
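
The activity calculation described above (fraction of HOIP~ubiquitin over total HOIP, normalized to wild type) can be sketched as follows. This is an illustration rather than the authors' analysis script: the band intensities are invented placeholders, total HOIP is assumed to be the sum of the HOIP~ubiquitin and free HOIP bands, and all function names are hypothetical. The statistical tests themselves (t-test, ANOVA with Tukey's post hoc test) are not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def thioester_fraction(hoip_ub, hoip_free):
    """Fraction of HOIP carrying ubiquitin, from two band intensities."""
    return hoip_ub / (hoip_ub + hoip_free)


def normalized_activity(mutant_bands, wt_bands):
    """Return (mean, s.e.m.) of mutant activity relative to mean wild-type activity.

    Each argument is a list of (HOIP~Ub intensity, free HOIP intensity) pairs,
    one pair per replicate.
    """
    wt_mean = mean(thioester_fraction(ub, free) for ub, free in wt_bands)
    relative = [thioester_fraction(ub, free) / wt_mean for ub, free in mutant_bands]
    return mean(relative), stdev(relative) / sqrt(len(relative))


# Placeholder triplicate intensities (arbitrary units), not measured data.
wt_bands = [(50.0, 50.0), (60.0, 40.0), (40.0, 60.0)]
mutant_bands = [(25.0, 75.0), (20.0, 80.0), (30.0, 70.0)]

activity_mean, activity_sem = normalized_activity(mutant_bands, wt_bands)
print(f"relative activity = {activity_mean:.2f} ± {activity_sem:.2f} (s.e.m., n = 3)")
```

Normalizing each replicate to the mean wild-type fraction keeps the spread of the mutant measurements visible in the s.e.m.; other normalization schemes (for example, pairing each mutant lane with the wild-type lane on the same gel) are equally plausible.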

Allosteric activation of HOIP RBR by di-ubiquitin. To test the allosteric activa- 
tion of HOIP RBR by linear di-ubiquitin, a modified ubiquitin transfer assay was 
performed. HOIP RBR was pre-incubated with N- and C-terminally blocked linear 
di-ubiquitin at the indicated final concentrations for 5 min at 25°C. At the same 
time, Ube1, UbcH5B, ubiquitin and ATP were premixed and incubated for 5 min
at 25°C, resulting in fully loaded UbcH5B~ubiquitin. Both mixtures were subsequently
mixed together, resulting in final concentrations of 100 nM Ube1, 2 µM
UbcH5B, 8 µM N-terminally blocked ubiquitin and 2 µM HOIP RBR in the final
buffer described for the standard ubiquitin transfer assay. Samples were taken after 
30s and further treated as described for the standard transfer assay. A 30-s time 
point was determined to give the best results in this assay, in which lower E2 and 
mono-ubiquitin concentrations were used, resulting in an overall slower reaction 
rate. The experiments comparing the effects of linear versus K48-linked di-ubiquitin 
(Extended Data Fig. 9e) were performed similarly, with the difference that all sam- 
ples were incubated with apyrase (Sigma) for 5 min to deplete ATP before addition 
of HOIP/di-ubiquitin and prevent E2-loading of K48-linked di-ubiquitin, which 
features a free C terminus on one of the ubiquitin units. 

Analytical ultracentrifugation (AUC). Sedimentation equilibrium experiments 
were performed in a ProteomeLab XL-I (Beckman Coulter) analytical ultracentri- 
fuge. HOIP RBR/UbcH5B~ubiquitin as used for crystallization was loaded into 
a 6-channel equilibrium cell at 5.0, 2.5 and 1.25 µM concentration and centrifuged
at 10,000 r.p.m., 20°C in an An-50 Ti 8-place rotor until equilibrium was
achieved. Data were analysed using HeteroAnalysis software (J. L. Cole and J. W. 
Lary, University of Connecticut; http://www.biotech.uconn.edu/auf/). 

Isothermal titration calorimetry (ITC). ITC experiments were performed
on an ITC200 calorimeter (Microcal). Aliquots (2 µl each) of 500–650 µM
UbcH5B~ubiquitin or di-ubiquitin solution were injected into the cell containing
40–50 µM HOIP RBR or HOIP RBR/di-ubiquitin complexes. The experiments
were performed at 23°C in buffer containing 10 mM HEPES pH 7.9, 100mM NaCl. 
For titrations of UbcH5B~ubiquitin into HOIP RBR/di-ubiquitin complexes, 
HOIP RBR was pre-incubated with an equimolar amount of di-ubiquitin before 
the ITC experiments. Data were analysed using the Origin software (Microcal). 

NF-κB luciferase assay. Human embryonic kidney (HEK) 293T cells (ATCC)
were co-transfected with NF-κB-luc reporter plasmid that contains an NF-κB
response element upstream of the promoter driving the luciferase reporter gene, 
pGL4.74[hRluc/TK] control vector (Promega) and epitope tagged Flag-HOIP 
or myc-HOIL-1L pcDNA3.1(+) plasmids in 6-well plates in triplicates using 
Lipofectamine 2000 transfection reagent. Since this assay could be carried out 
in a variety of cellular contexts, HEK293T cells were used because they are easy 
to transfect and suitable for the assay. The cells tested negative for mycoplasma 
contamination. Empty pcDNA3.1(+) vector was used as control. After 36h, cells 
were lysed and 20 µl cell lysates were used to measure firefly luciferase and Renilla
luciferase (transfection control) signals using the dual luciferase reporter assay 
system according to the manufacturer's protocol (Promega). Data were analysed 
in GraphPad Prism and one-way ANOVA followed by Tukey’s post hoc tests were 
used for statistical analysis. Immunoblotting was performed with anti-Flag (clone 
M2, Sigma-Aldrich) and anti-myc (clone 9E10, Sigma-Aldrich) antibodies, to 
confirm equivalent wild-type and mutant protein expression levels. 
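
A common way to use the Renilla signal as a transfection control is to divide each well's firefly reading by its Renilla reading and then express the result relative to the empty-vector control. The text does not spell out the exact normalization, so the sketch below is an assumption about a typical dual-luciferase workflow; the readings are invented placeholders and the function names are hypothetical.

```python
from statistics import mean


def relative_luciferase(firefly, renilla):
    """Firefly signal corrected for transfection efficiency via Renilla."""
    return firefly / renilla


def fold_induction(sample_wells, control_wells):
    """Per-well fold induction relative to the mean of the control wells.

    Each argument is a list of (firefly, Renilla) readings, one pair per well.
    """
    control_mean = mean(relative_luciferase(f, r) for f, r in control_wells)
    return [relative_luciferase(f, r) / control_mean for f, r in sample_wells]


# Placeholder triplicate readings (arbitrary luminescence units), not measured data.
empty_vector = [(100.0, 1000.0), (120.0, 1000.0), (80.0, 1000.0)]
hoip_sample = [(2000.0, 1000.0), (1800.0, 900.0), (2200.0, 1100.0)]

folds = fold_induction(hoip_sample, empty_vector)
print([round(x, 1) for x in folds])
```

The per-well fold values produced this way are the kind of input that would then go into the one-way ANOVA and Tukey's post hoc comparison mentioned above.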

Extended Data Figure 1 | HOIP domain organization and
nomenclature. a, Domain organization of HOIP as commonly outlined
in the literature. HOIP consists of a PNGase/ubiquitin-associated (PUB)
domain followed by a B-box zinc-finger (B-box) domain, NPL4 zinc-fingers
(NZF), the auto-inhibitory UBA domain and the RING-between-RING
module (RBR, grey background). The HOIP RBR module contains
the typical RING1, in-between RING (IBR) and RING2 domains, and
a HOIP-specific additional linear ubiquitin chain determining domain
(LDD). A yellow circle indicates the RBR catalytic cysteine (C885) forming 
the HECT-like thioester intermediate with ubiquitin. The binding sites of 
the other LUBAC constituents HOIL-1L and SHARPIN are also indicated. 
b, The RBR RING2 domain has the topology of an IBR domain. The 
individual HOIP RING1, IBR and RING2 domains from the HOIP RBR/
E2~Ub/Ub structure are shown to enable direct comparison of their 
folds. This illustrates that the zinc-finger domain designated RING2
in fact adopts the topology of an IBR, as multiple groups have reported
for various RBR E3 ligases previously. The terms RBR and
RING2 however are used in this study for consistency with the widely
accepted nomenclature. c, The HOIP RING2-LDD region. HOIP features
an extension of its catalytic RING2 domain termed LDD, which adds
two zinc-fingers and a helical arrangement to the RING2. The LDD is
usually denoted as a domain following RING2. However, Rittinger and
colleagues showed that the LDD is intertwined with the HOIP RING2
to form a single extended domain that contains a central canonical
RBR RING2 with the additional features of the LDD ensuring the linear
ubiquitin chain formation characteristic of HOIP. This domain will thus 
be designated RING2L (for RING2-LDD). The RING2L from the current 
HOIP structure is displayed with RING2 in light green and LDD in dark 
green. d, The structural arrangement of active HOIP RBR in the HOIP/ 
E2~ubiquitin complex is markedly different from that of auto-inhibited 
RBRs. Left, active HOIP RBR from the HOIP/E2~ubiquitin complex. 
The RING1-IBR region and the RING2L are coloured magenta and green 
respectively. The individual RBR domains are also highlighted: RING1, 
yellow circle; IBR, orange circle; RING2, red circle. The RING1 extension
helices (hE1, hE2) and IBR-RING2 linker helices (hL1 and hL2) are labelled.
Middle and right, analogous representations of auto-inhibited Parkin and 
HHARI (PDB: 4I1H (ref. 8) and PDB: 4KBL (ref. 11)). Additional domains 
and regions besides the RBR of Parkin and HHARI are coloured grey. 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 



Data collection and refinement statistics

                            HOIP RBR/UbcH5B~Ub/Ub
Data collection
  Space group               P2₁
  Cell dimensions
    a, b, c (Å)             104.82, 75.74, 120.96
    α, β, γ (°)             90, 95.56, 90
  Resolution (Å)            29.69–3.48 (3.72–3.48)*
  Rmerge                    0.093 (0.875)
  ⟨I/σI⟩                    9.9 (1.6)
  CC1/2                     0.995 (0.648)
  Completeness (%)          98.6 (96.1)
  Redundancy                6.5 (6.2)
Refinement
  Resolution (Å)            29.69–3.48
  No. reflections           22930
  Rwork / Rfree             24.87 / 30.26
  No. atoms
    Protein                 10953
    Ligand/ion              16
    Water                   0
  B-factors
    Protein                 185.8
    Ligand/ion              156.6
    Water                   ---
  R.m.s. deviations
    Bond lengths (Å)        0.009
    Bond angles (°)         1.465

Data were collected from a single crystal.
*Highest resolution shell is shown in parentheses.
Extended Data Figure 2 | Quality of crystallographic data and electron
density maps. a, Final 2Fo − Fc (left) and simulated annealing (SA)
composite omit (right) electron density maps of select interfaces of the
HOIP/UbcH5B~Ub/Ub complex contoured at 1σ. Proteins are shown in
sticks and coloured according to Fig. 1. b, Data collection and refinement
statistics.

Extended Data Figure 3 | Complexity of the crystallographic 
asymmetric unit and structure of the HOIP/UbcH5B~ubiquitin/ 
ubiquitin E2-E3 transfer complex. a, The asymmetric unit contains
two transfer complexes. Left, colour schematic of the proteins present
in the asymmetric unit. Middle, structure of the asymmetric unit. The
asymmetric unit contains two HOIP RBR/UbcH5B~ubiquitin complex
arrangements (complex 1, 2). The two UbcH5B~ubiquitin conjugates are
coloured orange and cyan, respectively, and are bound to two HOIP RBR
molecules (magenta and green), which cross over between the complexes.
Additional allosteric ubiquitin (Ub_allo, blue) and UbcH5B~ubiquitin
(UbcH5B~Ub_allo, yellow and blue, respectively) molecules are bound to
the HOIP RBRs in complex 1 and complex 2, respectively. Since Ub_allo
makes all contacts with the RBRs and the UbcH5B of the UbcH5B~Ub_allo
conjugate solely mediates crystal contacts (bottom left), only Ub_allo of
complex 1 is displayed and discussed in the text and figures in terms of
the additional ubiquitin binding. The black oval indicates an additional
HOIP hL1/Ub_act inter-complex interaction discussed in panels e and f.
Right, close-up of the region where the two RBRs of HOIP cross over
between the complexes. The close-up shows that residues D852 and P853
of the respective RBRs come within 6 Å of each other, suggesting a continuity
in the biological complex in which the two residues from the respective
RBRs are linked (as indicated by the grey background), resulting in the 
monomeric complex schematically illustrated underneath and discussed 
in panels c-f. b, The RING1-IBR and RING2L form two distinct entities 
to bind E2~ubiquitin. The monomeric complex as displayed for complex 
1 assumes a flexible linkage between the autonomous units of the RING1- 
IBR arm and the RING2L from the two different RBR molecules in the 
asymmetric unit. This linkage is formed by residues D852 and P853 
connecting IBR and RING2L (schematically illustrated in the cartoon). 
To test the structural integrity of the assumed link and the autonomy of 
the RING1-IBR arm on one side and the RING2L on the other side, we 
introduced a spacer comprising five alanine residues between D852 and 
P853 in HOIP RBR (see cartoon) and measured the activity of the RBR 
D852-Ala5-P853 insertion mutant in polyubiquitination assays. The
assays show that the mutant (right) retains an activity similar to the wild- 
type RBR (left) indicating that indeed RING1-IBR and RING2L act as 
autonomous units. The dramatically reduced activity of HOIP RING2L 
alone (residues P853 to end) is also shown for reference (middle). c, The 
HOIP RBR/UbcH5B~ubiquitin complex is monomeric at concentrations 
of 1.25–5 µM. To determine if the HOIP RBR/UbcH5B~ubiquitin
complex is indeed monomeric in solution, we analysed the isolated HOIP 
RBR/UbcH5B~ubiquitin complex protein material that was used for 
crystallization by sedimentation equilibrium analytical ultracentrifugation 
(SE-AUC). SE-AUC provides an absolute, shape-independent 
measurement of molecular weight, thus allowing accurate determination 
of the oligomeric state. The three SE-AUC experiments performed on the 
HOIP RBR/UbcH5B~ubiquitin complex yielded an absolute molecular 
weight (MW) of 71,658 Da, indicating a monomeric complex. At an
order of magnitude higher concentrations (12.5–50 µM), SE-AUC results
indicate the formation of a dimer with a MW of ~144 kDa, although curve
fitting residuals also show substantial presence of aggregates (data not
shown). These results indicate that the biological complex in solution is
monomeric at physiological low-µM concentrations such as those used for
the thioester transfer assays and polyubiquitination assays. However, the 
dimeric arrangement observed in the crystal structure might be relevant 
in a high concentration setting such as within the LUBAC complex. Here, 
a high local concentration of HOIP RBR could favour binding of the 
E2~ubiquitin between the RING1-IBR and RING2L of two neighbouring
molecules. Importantly, all mechanisms depicted in this article hold true 
for both the monomeric and dimeric states (as illustrated in d). This 
means that the deduced mechanism is in principle applicable to different 
RBR E3 ligases of which some might function as dimers in local high 
concentration assemblies (such as within the LUBAC), whereas others 
might be active in a monomeric setting. d, Schematic illustration of the 
dimeric arrangement as observed in the asymmetric unit. The schematic 
shows that all features deduced (Figs 1-4 and Extended Data Figs 4-10) 
are also valid for the dimeric case (binding of Ub_allo is omitted for clarity).
e, Asymmetric unit dimer-related interactions between HOIP hL1 and
Ub_act. The dimeric arrangement contains no additional protein-protein
interfaces compared to the monomeric assemblies with the exception of
hL1 residues W847, M850 and N851, which in the asymmetric unit contact
the activated ubiquitin of the other complex (indicated by an oval in a).
f, Mutational analysis of HOIP hL1/Ub_act interactions. Mutations of HOIP
hL1 residues that interact with Ub_act have no effect on thioester transfer
activity (Coomassie-stained bands in red), indicating that this ‘trans’ 
complex interaction is not critical for the RBR mechanism, in line with the 
model of a monomeric arrangement. 
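
The oligomeric-state argument in panel c amounts to dividing a fitted molecular weight by the monomer mass. As a rough sketch, the snippet below takes the 71,658 Da value measured at low concentration as the monomer reference and reads the ~144 kDa estimate as 144,000 Da, which is an approximation because the caption gives only a rounded figure. The function name is illustrative.

```python
def oligomer_ratio(observed_mw_da, monomer_mw_da):
    """Ratio of an observed molecular weight to the monomer molecular weight."""
    if monomer_mw_da <= 0:
        raise ValueError("monomer molecular weight must be positive")
    return observed_mw_da / monomer_mw_da


monomer_mw = 71_658.0           # Da, SE-AUC value at 1.25-5 µM (from the caption)
high_conc_mw = 144_000.0        # Da, assumed from the rounded ~144 kDa estimate

ratio = oligomer_ratio(high_conc_mw, monomer_mw)
print(f"observed / monomer = {ratio:.2f} -> closest integer state: {round(ratio)}")
```

A ratio close to 2 is what one expects for a dimer; because aggregates were also present at the higher concentrations, the value should be treated as indicative only.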

Extended Data Figure 4 | The HOIP RING1-IBR applies an altered
binding mode compared to classic RING E3s necessitating a HECT-like
mechanism. a, The HOIP RBR RING1 uses an E2 interaction
pattern similar to classic RINGs, but which results in a shifted binding.
Shown are the details of the RING/E2 interaction in the HOIP
RING1_RBR/UbcH5B~ubiquitin complex (left), the RNF4 RING_classic/
UbcH5A~ubiquitin complex (middle; PDB: 4AP4 (ref. 21)), and the
BIRC7 RING_classic/UbcH5B~ubiquitin complex (right; PDB: 4AUQ (ref. 24)).
The HOIP RBR-type RING1 uses a pattern of hydrophobic residues as 
the core of the interaction with E2 that is similar to that in classic RING 
E3 ligases. Subtle differences however support a shifted binding mode 
(see also Fig. 2b). The main features of the RING and E2 as well as HOIP 
residues mutated in Fig. 2e–g are displayed in bold. Zinc-finger (ZF)
1 and 2 of the RING domains and the SPA-loop of the E2 containing a
conserved Ser-Pro-Ala motif are annotated. For the following panels the 
same structures and colour codes as in a are used. b, The shift in binding 
and altered surface residues in HOIP RING1 do not support the composite 
RING/E2 binding site for activated ubiquitin used by classic RING/E2 
complexes. UbcH5A/B E2s are rendered as surface representation and 
the RING domains in ribbon representation. Residues crucial for classic 
RING E3s to recruit the activated ubiquitin in the composite RING/
E2 ubiquitin-binding surface are depicted (middle, right). In HOIP
RING1 (left) equivalent residues are not conserved (displayed), indicating
that HOIP cannot accommodate the activated ubiquitin in its RING1/
UbcH5B complex. For illustration purposes, only one monomer of the
dimeric RNF4 is shown (although Y193 from the other RING molecule
is still displayed); for BIRC7 the RING dimer (RING;) is displayed.

c, Alignment of HOIP, Parkin and HHARI RING1 domains with classic 
RING domains centred around residues displayed in b. Residues crucial 
for ubiquitin binding in classic RINGs are highlighted in cyan and their 
structural equivalents in the RING1 domains of the RBR E3 ligases HOIP, 
Parkin and HHARI are indicated by boxes, attesting to the absence of
a composite RING1/E2 ubiquitin-binding site in RBR ligases (asterisk
indicates ubiquitin interaction residues from the other RING molecule
in the dimers formed by the classic RINGs RNF4 and BIRC7). The
T/I-C-R sequence observed in the dimeric RINGs of RNF4 and BIRC7
represents the highly conserved Φ-x-R/K motif, where Φ is a hydrophobic
residue and x is either a Cys in RING E3 ligases or a polar residue in
U-box ligases. This motif is not only critical for E3-mediated catalysis
by dimeric RING ligases (such as RNF4 or BIRC7) but is also necessary
for E2-mediated catalysis by simpler monomeric RING and U-box E3
ligases. The fact that this motif is not conserved in HOIP, HHARI
and Parkin further confirms the mechanistic differences between RBRs
and classic RING domains. d, Thioester transfer assays show that E2
residues critical for classic RING-supported catalysis are not important
for HOIP catalysis. Left, HOIP RBR thioester transfer assays show similar
activity of wild-type UbcH5B and L104A and S108A mutants. This is
in stark contrast to the reported effects of these mutations on classic
RING-supported catalysis, underlining the fundamentally different
mechanism of the HECT-like catalysis by HOIP RBR. Right, mutation to
Ala of HOIP RBR N909, which would be in the vicinity of the activated 
ubiquitin if the E2~ubiquitin conjugate were bound in a bent manner 
(see e), also shows no effect on HOIP thioester formation (Coomassie- 
stained bands in red). e, The altered E2~ubiquitin binding mode of RBR 
RING1 results in the requirement for a HECT-like mechanism. Displayed
are the entire RING/UbcH5~ubiquitin complexes with RINGs and E2s
depicted in ribbon representation and the activated ubiquitin in surface
representation. The bipartite binding mode used by the HOIP
RING1-hE2-IBR arm (see also Fig. 2a) results in an elongated E2~ubiquitin
conformation (left, only the RING1 domain of HOIP is depicted) while 
formation of a composite RING/E2 binding surface in the case of classic 
RING E3 ligases (middle, right) results in binding of the activated 
ubiquitin in a compact manner with a bent E2~ubiquitin conformation. 
Importantly, this bent conformation places the thioester link in a specific 
position relative to the catalytic machinery of the E2, allowing direct 
attack by the lysine/amine function of a substrate or growing ubiquitin 
chain. The Lys85/Ser85 residues mediating the E2~ubiquitin linkage and 
mimicking UbcH5A/B catalytic cysteine C85 are displayed as red spheres. 
In the elongated E2~ubiquitin conformation propagated by the HOIP 
RBR, this attack is not possible. The linkage is however ideally positioned 
for the attack by the RBR catalytic cysteine in a HECT-like mechanism (see 
also Fig. 3 and Extended Data Fig. 7a). f, Close-up of the catalytic centres 
in E2~ubiquitin linkages. Details of the catalytic centres resulting from the 
E2~ubiquitin conjugate conformations outlined in e and, for comparison, 
the HECT-type E3 NEDD4L/UbcH5B~Ub structure (PDB: 3JW0 (ref. 28)), 
with the directionality of an attacking amine indicated as previously 
proposed. In the HECT-like RBR arrangement, the UbcH5B~ubiquitin
linkage is not aligned correctly relative to the E2 catalytic machinery for
a direct attack by an amine function. This is similar in the HECT-type
arrangement in the NEDD4L complex but completely different from the 
arrangement in classic RING-supported E2 catalysis. Additionally, the 
ubiquitin C-terminal residues G75-G76 reside in a position that would 
overlap with the attacking amine. The available structure of the BIRC7/ 
UbcH5B~Ub complex (PDB: 4AUQ (ref. 24)) features an UbcH5B N77A 
mutant and the remainder of the Asn side-chain has been manually added 
based on wild-type UbcH5B from PDB: 2ESK (ref. 45) (right). 

Extended Data Figure 5 | Helix-IBR-fold motifs constitute new
ubiquitin-binding regions (UBR) in active RBR proteins: binding of
the activated ubiquitin by UBR1 using binding mode 1. a, Schematic
illustrating binding mode 1 used by hE2-IBR to bind the activated
ubiquitin in the RING1-IBR arm. The general principle of this binding
mode is that the RING1 extension helix 2 (hE2) preceding the IBR presents
a pattern of charged/polar residues (indicated by blue and red squares, 
which symbolize K, R, H and E, Q, N residues respectively) that interact 
with ubiquitin E34 and K11. These interactions are supported by the IBR 
surface, with a particular contribution of hydrophobic residues (yellow 
square) flanking the salt bridge system. b, Coordination of the activated 
ubiquitin by HOIP hE2-IBR in mode 1. HOIP hE2 residues K783 and E787
bind ubiquitin residues E34 and K11 and are flanked by hydrophobic
residues M791 and W798 from HOIP IBR. c, Structurally equivalent
residues in Parkin and HHARI. Displayed are the hE2-IBR modules from
auto-inhibited Parkin (PDB: 5C1Z (ref. 12)) and HHARI (PDB: 4KBL 
(ref. 11)) with residues equivalent to HOIP residues in b depicted, 
illustrating the general conservation of UBR1. It should be noted that 
these structures feature auto-inhibited forms of the RBR proteins, which 
exhibit a kink in hE2 of UBR1. This kink would sterically hinder ubiquitin
binding to UBR1 and probably participates in the RBR auto-inhibition
mechanism (see also Extended Data Fig. 8). d, Thioester-transfer assays of
ubiquitin and UBR1 salt bridge mutants. In agreement with the observed 
four-residue salt bridge system in b, the single K11A or E34A ubiquitin 
mutations show only a slight to moderate effect since the remaining 
charged residue can still coordinate the two oppositely charged residues 
of HOIP. In contrast, elimination of both similarly charged residues in 
the complex by combining the HOIP K783A and ubiquitin K11A or 
HOIP E787A and ubiquitin E34A mutations results in a more dramatic 
loss of activity (mean activity ± s.e.m. (n = 3), one-way ANOVA followed
by Tukey’s post hoc test; **P < 0.01; ***P < 0.001; NS, not significant;
representative gels shown in Supplementary Fig. 1). e, f, Role of the
IBR in UBR1. Close-up of the additional IBR/Ub_act interactions in stick
representation (e) shows that HOIP S803 and ubiquitin K6 coordinate 
the backbone carbonyl functions of ubiquitin T12 and HOIP A800/K829, 
respectively. W798, which is involved in hydrophobic interactions, is also 
displayed in sphere representation. Quantitative thioester transfer assays 
(f) show that alanine mutants of residues outlined in e cause a significant 
loss of activity (mean activity ± s.e.m. (n = 3), left: one-way ANOVA
followed by Tukey’s post hoc test, right: two-tailed unpaired Student’s 
t-test; **P < 0.01; ***P < 0.001; representative gels shown in 
Supplementary Fig. 1). 

Extended Data Figure 6 | Binding of the activated ubiquitin by UBR2 
using binding mode 2 and exclusive binding of E2 and acceptor 
ubiquitin. a, Schematic illustrating binding mode 2 used by a helix-IBR-fold
motif (hL2-RING2) to bind the activated ubiquitin and position the
thioester linkage for the transfer reaction. The second helix (hL2) of the
linker between the IBR domain and the catalytic RING2 domain uses a 
pattern of two or three hydrophobic residues (yellow squares) to interact 
with the ubiquitin canonical hydrophobic patch surrounding I44 (ref. 28;
not shown). Hydrophobic residues of the RING2 IBR-fold complete the 
hydrophobic interaction network by coordinating residues L71 and L73 
in the second hydrophobic patch”® of ubiquitin (not shown). The central 
hallmark of this binding mode is the coordination of the characteristic 
di-Arg (R72, R74) motif in the ubiquitin C terminus, resulting in a
firm placement of the C terminus on ZF1 of RING2. b, Structure of the
interaction of the helix-IBR-fold in HOIP hL2-RING2 (UBR2) with
the activated ubiquitin. Left, hL2 residues L860, Y863 and L864 (yellow
spheres) interact with ubiquitin residues L8, I44 and V70 (not shown).
Additionally, hydrophobic residues F876 and Y878 (yellow spheres)
from the IBR-fold of the minimal catalytic RING2 (light green, see also
Extended Data Fig. 1c) coordinate ubiquitin residues L8, L71 and L73 
(not shown). Right, display of the full HOIP RING2 including the LDD 
insertion (RING2L). The coordination of the ubiquitin di-Arg motif is 
achieved by HOIP residues D983 and E976 from the LDD insertion that 
is part of the catalytic HOIP RING2L (dark green). This results in the 
placement of the E2~Ub thioester linkage (K85 replacing UbcH5B C85 is 
shown as orange sticks) in the vicinity of the catalytic HOIP C885. Bottom, 
alternatively oriented views of the interaction. c, The two hydrophobic 
patches of ubiquitin engaged by UBR2. The hydrophobic residues of 
HOIP RING2 interacting with ubiquitin as highlighted in b are shown
as yellow sticks. The interaction residues on the canonical hydrophobic
patch of ubiquitin (L8, I44, V70) and the second hydrophobic patch
(L71, L73)?* are displayed as grey spheres. d, The RING2 domains of 
Parkin and HHARI also contain a helix-IBR-fold (hL2-RING2) module
with patterns of residues consistent with the formation of a UBR2. Left, 
helical predictions for the region preceding the RING2 domains of HOIP, 
Parkin and HHARI. The structures of Parkin and HHARI in their auto-
inhibited forms do not display a helix equivalent to hL2 because this
region is either not defined (in the crystal structures of HHARI and most 
Parkin structures) or adopts an extended conformation (in two other 
Parkin structures)*"'!, However, a helical prediction reliability score 
(with 1 lowest to 9 highest score) calculated using JPred4** shows a strong 
helical probability for the segment of Parkin and HHARI preceding the 
RING2 domain. In fact, the score is similar to that of HOIP, which is 
displayed with the observed helical secondary structure, pointing to the 
presence of an equivalent of hL2 in active forms of Parkin and HHARI.
These RBR E3 ligases also contain residues capable of interacting with the 
hydrophobic patch in ubiquitin in positions equivalent to HOIP Y863 and 
L864 (highlighted in yellow). Right, structures of the RING2 domains of 
Parkin (PDB: 4I1H (ref. 8)) and HHARI (PDB: 4KBL (ref. 11)) showing
hydrophobic residues (yellow sphere representation) in structurally 
equivalent positions to HOIP F876 and Y878 and residues (labelled red) 
capable of interacting with the di-Arg motif in their catalytic RING2. 
Helix hL2, which carries the conserved hydrophobic residues but is absent from the
crystal structures as discussed above, is indicated schematically. Bottom,
different orientations with the putative placement of the ubiquitin C 
terminus indicated schematically. e, The effect of UBR2 alanine mutations 
in thioester transfer assays increases with their proximity to the di-Arg 
motif. Left, mutation of HOIP UBR2 hydrophobic residues to alanine. The 
hL2 L860A and L864A mutations show little effect on activity, while the
Y863A mutation and particularly the RING2L F876 and Y878 mutations, 
which reside proximal to the di-Arg binding motif formed by D983 and 
E976 (see also Extended Data Fig. 7) show a marked reduction in activity. 
Middle, right, mutation of complementary ubiquitin residues involved in 
UBR2 binding. Similarly to the HOIP mutations, the ubiquitin I44A, L71A
and L73A mutations show increasing effects with a closer location to the 
di-Arg motif. Furthermore, the ubiquitin R74A mutation shows a strong 
effect on activity, emphasizing the importance of its interaction with HOIP 
and of the resulting placement of the ubiquitin C terminus linked to the 
E2. The ubiquitin R72A mutant failed to form a UbcH5B~ubiquitin
conjugate, thus preventing analysis. Coomassie-stained bands are in red. 
f, Overlap of the UbcH5B binding site on HOIP RING2L with the binding 
site of the acceptor ubiquitin. Left, UbcH5B~ubiquitin,; (orange/cyan) 
interaction with RING2L (green) from the HOIP RBR/E2~ubiquitin 
complex. Right, RING2L interaction with two ubiquitin molecules 
arranged in linear fashion, mimicking the HOIP RING2L~ubiquitin(donor)
to ubiquitin(acceptor) (Ubdon, Ubacc) transfer complex (PDB: 4LJP (ref. 14)).
Despite the fact that the placement of the donor ubiquitin in the 4LJP 
structure results from a crystal contact, this ubiquitin exhibits a position 
identical to that of the activated ubiquitin bound to UBR2 in the HOIP 
RBR/UbcH5B~Ub complex. It should be noted that the UBR2 interaction 
with hL2 is missing because the RING2L from the crystal neighbour
presenting the donor ubiquitin pushes hL2 into a different conformation.
Importantly, in the HOIP RBR/UbcH5B~ubiquitin complex (left) the
E2 binds RING2L in a region that overlaps with the binding site for the
acceptor ubiquitin (Ubacc, dark blue) in the RING2L~ubiquitin(donor)
to ubiquitin(acceptor) transfer complex (right). This highlights how the
E2~ubiquitin conjugate and the acceptor ubiquitin (which is the substrate
of the E3 reaction) cannot bind the RBR at the same time, thus making
a HECT-like transfer a requirement in the E3 ligase mechanism of RBR
proteins.
Extended Data Figure 7 | Catalytic centre of the E2~ubiquitin/HOIP
RBR E3 transfer complex. a, Close-up view of the catalytic centre of the 
transfer complex shows conservation of the contact conduits. Top left, 
close-up view of the catalytic centre in the HOIP/UbcH5B~ubiquitin 
transfer complex. Contact conduits 1 and 2 are highlighted with grey 
backgrounds. HOIP catalytic cysteine C885 is depicted in sphere 
representation. K85 replacing the catalytic cysteine (C85) in UbcH5B 
and ubiquitin G76 are displayed in stick representation, featuring the 
UbcH5B~ubiquitin linkage. Top right, model of the conduits in a Parkin/ 
UbcH5B~ubiquitin complex. The structure of Parkin RING2 (from auto-
inhibited Parkin, PDB: 4I1H (ref. 8)) was overlaid on that of HOIP
RING2 indicating equivalent contact conduits. Bottom left, analogous 
model for HHARI using RING2 from auto-inhibited HHARI (PDB: 
4KBL (ref. 11)). Bottom right, model of the conduits in a HOIP/
UbcH7~ubiquitin complex. The model was generated from PDB entry 
4Q5E (ref. 55) with UbcH7 (with the free catalytic cysteine C86 displayed) 
overlaid on UbcH5B of the HOIP/UbcH5B~ubiquitin transfer complex. 
The structure of the HOIP/UbcH5B~ubiquitin transfer complex and the 
other models depicted indicate a conservation of the contact conduits. 
Mechanistically, the conduits allow for the RBR catalytic cysteine and the 
E2 catalytic cysteine~ubiquitin linkage to be in close proximity, which 
serves as main driving force of the transesterification reaction. A reaction 
driven mainly by proximity is also in agreement with the chemical nature 
of the catalytic cysteine, which has a pKa of ~8 for the free amino acid.
This allows the cysteine to naturally deprotonate, without an absolute 
need for HOIP H887 (refs 10, 14), before attack of the ubiquitin G76 
carbonyl function. In addition, the thioester linkage is far more labile 
than for example an amide bond, thus further facilitating a proximity- 
mediated reaction*®. However, in light of the geometric arrangement 


observed, additional subtle catalytic contributions of H887 in supporting 
the transition state of the reaction and/or re-protonation of the E2 catalytic 
cysteine are in principle possible. This prospect is particularly intriguing 
because UbcH7 exhibits a potential break in conduit 2 (between H887 and 
H119), yet provides its ‘own’ histidine (H119) to the catalytic centre. 

b, Thioester transfer assays for HOIP contact conduit mutants. Left, 
thioester transfer assays show that the D983A and E976A mutations 
strongly affect activity. This is consistent with D983 forming the di-Arg 
binding motif in UBR2 (see Extended Data Fig. 6) and E976 bridging the 
three proteins in the complex. In contrast, the H887A mutation does not 
have a marked effect, in agreement with published results (refs 10, 14). The Q974A
mutation also does not have a strong effect, pointing to a weak auxiliary 
function of this residue in support of the critical D983 (Coomassie-stained 
bands in red). Right, the UbcH5B R90A mutation shows a moderate
yet significant effect, in line with the structure, which suggests a more
pronounced effect for HOIP E976A than for UbcH5B R90A. Surprisingly, 
mutation of the catalytic D117 in UbcH5B, which is essential for classic 
RING-supported catalysis, shows a positive effect on the HECT-like 
thioester transfer, further emphasizing a separate mechanism for RBR 
HECT-like catalysis (as outlined in Fig. 2 and Extended Data Fig. 4).
The gain of function of the D117A mutation also points to a trade-off
for this E2 residue to participate in the classic RING-supported versus
RBR HECT-like E2/E3 mechanisms (mean activity ± s.e.m. (n = 3),
two-tailed unpaired Student's t-test; **P < 0.01; representative gels shown
in Supplementary Fig. 1). c, Polyubiquitination assays for HOIP contact
conduit alanine mutants. These assays show similar activity profiles as the 
thioester transfer assays except for the H887A mutation, which is essential 
for amide bond formation in the second transfer reaction (refs 10, 14).
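
The pKa argument in panel a can be made concrete with the Henderson-Hasselbalch relation. The sketch below estimates the fraction of a thiol present as the reactive thiolate, using the pKa of ~8 quoted for free cysteine. The pH values are assumptions chosen for illustration, and the pKa of the catalytic cysteine inside the folded protein may well differ from that of the free amino acid.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol that is deprotonated (thiolate) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA_FREE_CYS = 8.0  # value quoted in the caption for the free amino acid

# Assumed, illustrative pH values; the caption does not specify the assay pH.
for ph in (7.0, 7.4, 8.0):
    fraction = thiolate_fraction(PKA_FREE_CYS, ph)
    print(f"pH {ph:.1f}: {fraction:.1%} thiolate")
```

Even below the pKa, an appreciable share of the cysteine is deprotonated at any moment, which is consistent with the caption's point that a general base is not strictly required.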
Extended Data Figure 8 | The hE2-IBR module contains an additional
UBR (UBR3) that binds an allosteric ubiquitin using binding mode 2.
a, UBR3 binds the allosteric ubiquitin (Uballo) using binding mode 2.
Cartoon depicting the overall features of binding mode 2 in UBR3.
The binding of Uballo is largely analogous to the binding of the activated
ubiquitin by UBR2 in the helix-IBR-fold of hL2-RING2L (Extended Data
Fig. 6a). Yellow squares indicate hydrophobic patches. b, Location of the
UBR3/Uballo interface in the overall complex and its relation to the UBR1/
Ubact interface. The additional ubiquitin (Uballo) binds to UBR3 in the
hE2-IBR module immediately across UBR1 and the activated ubiquitin.
The electron density map (feature-enhanced map”, grey, contour level 

σ = 1) for the additional ubiquitin is also depicted on the right. c, Details
of the UBR3/Uballo interaction and its similarity to the interaction of
phospho-ubiquitin (pUb) with Pediculus humanus Parkin (Ph-Parkin). 
Top left, close-up of HOIP UBR3. Uballo binds to the hE2-IBR module
with additional contacts to hE1. Depicted are hydrophobic residues of
HOIP (V789, L790, F796, W798 and I807) interacting with the ubiquitin
canonical hydrophobic patch (L8, I44 and V70/not shown) and a second
hydrophobic patch (L71 and L73/not shown), with the critical HOIP
I807 emphasized. The di-Arg binding motif is also depicted, with E809
coordinating ubiquitin R72 and R74 and aligning the ubiquitin C terminus
in a parallel manner with sheet β2 of the IBR. A red arrow indicates that
UBR3 sterically allows the binding of di-ubiquitin/polyubiquitin chains 
on the C-terminal side of the bound ubiquitin (see also Extended Data 
Fig. 9). Ubiquitin Ser65 is indicated for comparison with the pUb/Ph- 
Parkin structure (bottom). Top middle, close-up on the di-Arg binding 
motif. Top right, close-up on the additional contacts between hE1 and
Uballo. HOIP R770 interacts with D766, which makes contacts to Y778,
and the backbone carbonyl functions of ubiquitin K63 and E64. Bottom, 
the recent structure of phospho-ubiquitin bound to Ph-Parkin (PDB: 
5CAW (ref. 13)) reveals a similar mechanism. Left, close-up with the 
residues corresponding to those in HOIP depicted. Middle, close-up of the 
di-Arg motif. Ph-Parkin D346 coordinates R72 similar to HOIP UBR3/
Uballo. The chemical tether introduced in the pUb/Ph-Parkin structure
between the phospho-ubiquitin C terminus and a non-conserved Cys
in Ph-Parkin (C349) shifts ubiquitin R74 away from Ph-Parkin D346,
indicating that the di-Arg binding motifs of HOIP and Parkin undergo
a similar interaction with ubiquitin or phospho-ubiquitin, respectively.
Right, the Ph-Parkin interaction equivalent to the HOIP hE1 interaction
involves a ubiquitin phospho-serine 65 (pSer65) binding pocket in Ph- 
Parkin. Phospho-serine 65 is directly coordinated by R307 and Y314 and 
also H304, which is positioned similarly to HOIP D766. d, The binding of 
phospho-ubiquitin propagates the formation of UBR1 in Parkin. Overlay 
of Uballo (slate) bound to HOIP (magenta) and phospho-ubiquitin (light
blue) bound to Ph-Parkin (grey blue). Ubiquitin binding propagates
a straight conformation of hE2 and an opening of the IBR relative to
the extended RING1 (only hE2 is shown). The tether introduced in the
phospho-ubiquitin/Ph-Parkin interaction appears to exert a strain on IBR 
loop 2 and thus the IBR. This suggests that in the absence of the artificial 
tether phospho-ubiquitin can propagate the formation of a fully functional
UBR1 (binding Ubact) in Parkin concomitantly to relieving the UBL
autoinhibition, as elegantly demonstrated by Komander and colleagues”?. 
e, The interaction of phospho-ubiquitin with a site analogous to UBR3
in Ph-Parkin causes a straight conformation of hE2 and reorientation of
the IBR relative to RING1 as prerequisite to accommodate the activated
ubiquitin. Left, full-length auto-inhibited human Parkin (PDB: 5C1Z
(ref. 12)). Right, tethered phospho-ubiquitin in complex with ΔUBL-
Ph-Parkin (PDB: 5CAW (ref. 13)). f, Conservation of UBR3 in HHARI.
HHARI features a helix-IBR module similar to that of HOIP, with 
conserved hydrophobic patches and a polar residue (Q288) in the position 
of HOIP E809 that is in principle capable of binding the ubiquitin di-Arg 
motif. The hE2-IBR region from auto-inhibited HHARI (PDB: 4KBL
(ref. 11)) is displayed. As for auto-inhibited Parkin, a key difference from
active HOIP is the kink in helix hE2 of auto-inhibited HHARI (see also
panel g and Extended Data Fig. 5c). Taken together, these features indicate 
an overall similarity and the existence of a UBR3 that is allosterically 
linked to UBR1 in HOIP, Parkin and HHARI and potentially also other 
RBR proteins (see alignment in Supplementary Data 2). g, HHARI UBA 
domain (pink) binds HHARI RBR (light orange) in an equivalent position 
as Uballo binding to HOIP, but promotes an inhibitory conformation of
RING1-IBR that cannot bind the activated ubiquitin (PDB: 4KBL (ref. 11)). 
h, UBR3 is a regulatory hotspot and UBR3/UBR1 crosstalk. Parkin auto-
inhibition is facilitated by a UBL domain and is inherently linked to the
equivalent of UBR3 and counteraction by phospho-ubiquitin binding.
In contrast to Parkin, no structure of auto-inhibited HOIP RBR is available
and thus the conformation of auto-inhibited HOIP is not known. However,
HHARI is, like HOIP, auto-inhibited by its UBA domain and the auto-
inhibited structure of HHARI has been solved previously (PDB: 4KBL
(ref. 11)). This structure reveals that unlike the UBL of Parkin, the UBA 
of HHARI directly utilizes the region of UBR3 for binding and auto-
inhibition, which includes the kink in hE2 and a relative RING1-IBR
positioning that is incompatible with binding of Ubact as observed in
this study, further underscoring a regulatory ‘hotspot’ function of the
RBR UBR3/UBR1. Left, the UBA domain (pink) of HHARI utilizes an
anti-parallel β-sheet anchor with strand β2 of the HHARI IBR (light
orange) positioning the UBA to induce a kink in helix hE2 compared to its
conformation in active HOIP (magenta) and counteracting the formation
of a productive UBR3 and UBR1. Right, binding of Uballo to UBR3
in mode 2 utilizes a parallel β-sheet anchor (centred around the di-Arg
binding interaction) with strand β2 of the IBR in active HOIP (magenta)
inducing a straight conformation of hE2 and a conformation of UBR1
suited to bind the activated ubiquitin of the E2~Ubact conjugate. Putative
shifts for an analogous UBR1 formed in HHARI are indicated. Of note,
the structure of auto-inhibited HOIP is not known and therefore the
placement of the UBA in the schematic illustration of Extended Data
Fig. 10 is deduced based on the similar auto-inhibition of HOIP and
HHARI by their UBA domains, which still needs to be demonstrated.
Extended Data Figure 9 | UBR3 interacts with di-ubiquitin and
allosterically promotes E2~ubiquitin binding and HOIP RBR 
activation. a, ITC experiments analysing the binding of mono-ubiquitin 
or linear di-ubiquitin to an isolated HOIP RBR/E2~ubiquitin complex. 
While the binding of mono-ubiquitin is below the sensitivity of the 
experimental setting (Kd > 50 μM), the binding of linear di-ubiquitin
exhibits a Kd of 7.1 μM. b, ITC experiments analysing the binding

of UbcH5B~ubiquitin to HOIP RBR in the absence and presence of
different di-ubiquitin chains. Top left, the binding stoichiometry of
UbcH5B~ubiquitin and wild-type HOIP RBR is n = 1.8, indicating
that two UbcH5B~ubiquitin molecules interact with one RBR through
UBR1/2 (catalytic binding) and UBR3 (binding of the ubiquitin moiety of
the E2~ubiquitin conjugate) with a combined overall Kd of 1.6 μM. The
graph shows a single titration step, indicating ‘crosstalk’ between UBR3 
and UBR1. Top right, the presence of K48-linked di-ubiquitin leads to
1:1 binding (n = 0.9) of UbcH5B~ubiquitin to HOIP RBR, indicating
that di-ubiquitin occupies UBR3 and limits UbcH5B~ubiquitin binding
to only its bona fide catalytic binding site (with ubiquitin binding to
UBR1/2 and UbcH5B binding to RING1/RING2L). However, the presence
of K48-linked di-ubiquitin results in the lowest affinity (Kd = 3.4 μM)
for the binding of the conjugate to the RBR, indicating a negative
effect of this linkage compared to the other di-ubiquitin entities tested.
Bottom left, K63-linked di-ubiquitin has a more favourable effect on
UbcH5B~ubiquitin binding (Kd = 2.0 μM). Bottom right, the strongest
allosteric effect is observed in the presence of linear di-ubiquitin, which 
enables sub-micromolar binding (Kd = 600 nM) of UbcH5B~ubiquitin.
These results show that linear di-ubiquitin functions as a potent activator 
of HOIP RBR by binding to UBR3 (see also below and Fig. 4). While
the structure depicts the interactions of one ubiquitin unit with UBR3,
a second ubiquitin C-terminal to the UBR3-interacting ubiquitin may
undergo further interactions with the IBR (as indicated by the arrow
in Extended Data Fig. 8c). The cartoon representations summarize the
configuration of each ITC experiment. c, Polyubiquitination assays of 
UBR3 mutants. The HOIP RBR I807A, E809A and R770A mutants exhibit 
a marked reduction in activity, supporting the importance of UBR3 in 
HOIP function. d, Activation of HOIP RBR by di-ubiquitin. While wild- 
type HOIP RBR is activated by the presence of increasing concentrations 
of wild-type linear di-ubiquitin, wild-type linear di-ubiquitin only has a 
weak effect on activation of the UBR3 R770A mutant (similar to the I807A
and E809A mutants in Fig. 4b). Additionally, the di-ubiquitin mutant
I44A, which is mutated at a critical UBR3-interacting residue in both
ubiquitin units, does not have an activating effect on wild-type HOIP
RBR thioester activity (mean activity ± s.e.m. (n = 3), one-way ANOVA
followed by Tukey’s post hoc test; *P < 0.05; **P < 0.01; ***P < 0.001;
NS, not significant; representative gels shown in Supplementary Fig. 1).

e, Effect of linear versus K48-linked di-ubiquitin on HOIP RBR thioester 
transfer activity. In contrast to the ITC binding studies in b, linear and 
K48-linked di-ubiquitin are both able to increase the thioester transfer 
activity of HOIP RBR (although the experimental setup necessary to 
investigate the K48-linkage resulted in larger error; see Methods; mean
activity ± s.e.m. (n = 3), one-way ANOVA followed by Tukey’s post
hoc test; *P < 0.05; **P < 0.01; NS, not significant; representative gels
shown in Supplementary Fig. 1). These results show a UBR3-dependent
activating effect of di-ubiquitin, and thus potentially of polyubiquitin 
chains. However, whether HOIP UBR3 acts as a universal ubiquitin sensor 
or has a preference for linear ubiquitin over other types of linkage needs 
to be further examined through careful investigations also including full- 
length proteins of the LUBAC in cellular settings. Additionally, although 
there is a substantial gap between UBR3 and the position of the acceptor 
ubiquitin, longer acceptor ubiquitin chains might be able to bridge this 
gap and mediate a cooperative effect between the two sites. This would
be consistent with a recent publication showing that the presence of K63-
linked ubiquitin chains is frequently necessary for the formation of linear
polyubiquitin chains**. f, Protein expression levels for the NF-κB reporter
assays in cells shown in Fig. 4d. Shown are anti-Flag immunoblots of 
wild-type HOIP and mutants and anti-myc immunoblots of HOIL-1L, 
demonstrating similar protein expression levels in different cell lysates. 
Lysates were also probed by immunoblotting for actin as a loading control. 
Uncropped blots are shown in Supplementary Fig. 1. g, Time course
of HOIP~ubiquitin thioester transfer assay. Left, SDS-PAGE showing
time course of HOIP~ubiquitin thioester transfer assay. Coomassie-
stained bands in red visualized using LI-COR Odyssey at 700 nm. Right,
plot of quantified HOIP~ubiquitin thioester transfer assay time-course
(mean ± s.e.m., n = 2). The 10-s time point used in the end-point assays
throughout the study is highlighted in red. 
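
To put the ITC affinities in panel b on a common energetic scale, a dissociation constant can be converted to a standard binding free energy with ΔG° = RT ln(Kd / 1 M). The sketch below applies this to the four UbcH5B~ubiquitin affinities quoted above (1.6 μM without di-ubiquitin, 3.4 μM with K48-linked, 2.0 μM with K63-linked and 600 nM with linear di-ubiquitin). The temperature of 298.15 K is an assumption, since the caption does not state it, and the comparison is only approximate because the di-ubiquitin-free titration reports a combined Kd for two binding events.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1
T = 298.15          # assumed temperature in K; not stated in the caption


def binding_free_energy(kd_molar: float, temperature: float = T) -> float:
    """Standard binding free energy (kJ/mol) from a dissociation constant in M."""
    return R * temperature * math.log(kd_molar)


# Kd values for UbcH5B~ubiquitin binding to HOIP RBR, as quoted in panel b.
kd_values = {
    "no di-ubiquitin": 1.6e-6,
    "K48-linked di-ubiquitin": 3.4e-6,
    "K63-linked di-ubiquitin": 2.0e-6,
    "linear di-ubiquitin": 600e-9,
}

reference = binding_free_energy(kd_values["no di-ubiquitin"])
for condition, kd in kd_values.items():
    dg = binding_free_energy(kd)
    print(f"{condition}: dG = {dg:.1f} kJ/mol, ddG = {dg - reference:+.1f} kJ/mol")
```

A more negative value means tighter binding, so the sign of each ddG shows directly whether a given di-ubiquitin linkage strengthens or weakens conjugate binding relative to the reference titration.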
Extended Data Figure 10 | Schematic of RBR mechanism: HOIP RBR
activation and E3 ligase cycle. HOIP RBR is initially auto-inhibited by its 
UBA domain. Sequestration of the auto-inhibitory HOIP UBA domain by 
HOIL-1L*”' releases the conformational restraint exerted by the UBA, 
allowing formation of UBR1 and UBR3. Binding of a ubiquitin entity such 
as a linear ubiquitin chain to UBR3 stabilizes the active conformation of 
UBR1 and the RING1-IBR arm, facilitating binding of the E2~ubiquitin
conjugate. In the subsequent HOIP/E2~ubiquitin transfer complex, the 
E2~ubiquitin conjugate is engaged in a clamp-like manner bringing the 
RBR active cysteine and the E2~ubiquitin thioester in close proximity, 


ultimately leading to the transfer of the ubiquitin to the RBR cysteine.

The E2 then vacates the complex, freeing the site for binding of the
acceptor ubiquitin, the N-terminal amine of which attacks the RBR
thioester (ref. 14). Once the ubiquitin chain linkage is formed, the ubiquitinated
substrate/growing ubiquitin chain must exit RING2L to enable binding
of a new E2~ubiquitin conjugate for the next loading of the RBR in the
HECT-like E3 ligase cycle. The growing ubiquitin chain could be retained 
near the RBR by the HOIP NZF domains, HOIL-1L or SHARPIN!*!7*, 
directly linking the HECT-like mechanism to co-operative processes 
within the LUBAC. 
Structure of transcribing mammalian RNA polymerase II


Carrie Bernecky, Franz Herzog, Wolfgang Baumeister, Jürgen M. Plitzko & Patrick Cramer


RNA polymerase (Pol) II produces messenger RNA during 
transcription of protein-coding genes in all eukaryotic cells. The Pol II 
structure is known at high resolution from X-ray crystallography 
for two yeast species (refs 1–3). Structural studies of mammalian Pol II,
however, remain limited to low-resolution electron microscopy
analysis of human Pol II and its complexes with various proteins (refs 4–10).
Here we report the 3.4 Å resolution cryo-electron microscopy
structure of mammalian Pol II in the form of a transcribing complex 
comprising DNA template and RNA transcript. We use bovine Pol II, 
which is identical to the human enzyme except for seven amino- 
acid residues. The obtained atomic model closely resembles its 
yeast counterpart, but also reveals unknown features. Binding of 
nucleic acids to the polymerase involves ‘induced fit’ of the mobile 
Pol II clamp and active centre region. DNA downstream of the 
transcription bubble contacts a conserved ‘TPSA motif’ in the jaw
domain of the Pol II subunit RPB5, an interaction that is apparently
already established during transcription initiation’. Upstream DNA 
emanates from the active centre cleft at an angle of approximately 
105° with respect to downstream DNA. This position of upstream 
DNA allows for binding of the general transcription elongation 
factor DSIF (SPT4-SPT5) that we localize over the active centre 
cleft in a conserved position on the clamp domain of Pol II. Our 
results define the structure of mammalian Pol II in its functional 
state, indicate that previous crystallographic analysis of yeast Pol II 
is relevant for understanding gene transcription in all eukaryotes, 
and provide a starting point for a mechanistic analysis of human 
transcription. 

To determine the high-resolution structure of mammalian Pol II,
we prepared the bovine enzyme from calf thymus (Methods). Purified 
bovine Pol II contained all 12 subunits (RPB1-RPB12) in apparently 
stoichiometric amounts and bound to human Gdown1, an additional 
metazoan-specific Pol II subunit (ref. 11) (Extended Data Fig. 1a). The
polymerase also bound a synthetic DNA-RNA scaffold to form an elon- 
gation complex (EC) that was active in extending the RNA transcript 
(Extended Data Fig. 1b, c). Crystallization trials were unsuccessful, but 
we could determine a high-resolution structure of the bovine Pol II 
EC by cryo-electron microscopy (cryo-EM) and single particle recon- 
struction (Methods). 

The EC sample was used for cryo-EM data collection with a K2 
direct electron detection device (Gatan). These data revealed particles 
of the expected size that resulted in defined 2D class averages 
(Extended Data Fig. 1d, e). Unsupervised 3D classification of 409,401 
particle images led to a reconstruction of the Pol II EC from 264,134 
particles at an overall resolution of 3.4 Å and a local resolution of 3.0 Å
in the active centre (EC1, Extended Data Figs 2 and 3). Upstream 
DNA and the RPB4-RPB7 stalk were flexible, but alternative sorting 
of particles led to EC reconstructions with improved density for the 
RPB4-RPB7 stalk (EC2; 219,265 particles) and for upstream DNA 
(EC3; 184,122 particles) at resolutions of 3.6 Å and 3.7 Å, respectively.




Density for Gdown1 was not observed in these reconstructions, prob- 
ably because Gdown1 is flexibly tethered to Pol II and requires inter- 
actions with additional factors to adopt a defined location that could 
be detected by cryo-EM. 
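
As a quick way to read the particle counts above, the following sketch computes what share of the 409,401 classified particle images ended up in each reconstruction. The counts are taken from the text; the function and variable names are illustrative only, and the three subsets come from alternative sortings of the same data, so their fractions are not expected to sum to one.

```python
TOTAL_PARTICLES = 409_401  # particle images entering unsupervised 3D classification

# Particle counts per reconstruction, as reported in the text.
reconstructions = {
    "EC1": 264_134,  # overall 3.4 Å map
    "EC2": 219_265,  # improved density for the RPB4-RPB7 stalk
    "EC3": 184_122,  # improved density for upstream DNA
}


def retained_fraction(n_used: int, n_total: int = TOTAL_PARTICLES) -> float:
    """Fraction of the classified particle images used in a reconstruction."""
    return n_used / n_total


for name, count in reconstructions.items():
    print(f"{name}: {count:,} particles ({retained_fraction(count):.1%} of input)")
```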

The vast majority of imaged particles encompassed the 12-subunit 
Pol II EC. A minor fraction of EC particles with a similar angular 
distribution displayed less density for the jaw-lobe region, indicating 
mobility as suggested by a previous low-resolution EM study of human 
Pol II*. An additional fraction of particles lacking nucleic acids was 
also observed (Extended Data Fig. 2). These particles contained free 
Pol II with a flexible clamp, also consistent with previous observations*.
The corresponding reconstruction showed that parts of the
active centre region and hybrid-binding domain were also flexible. 
This flexibility was only observed in the absence of nucleic acids, indi- 
cating that Pol II undergoes an induced fit when binding to nucleic 
acids. In the absence of nucleic acids, fork loop 2 was rearranged, 
covering a positively charged patch that interacts with the downstream 
edge of the single-stranded region of the DNA non-template strand 
in the EC. 

The mobility of the clamp was previously inferred from crystallo- 
graphic studies of yeast Pol II. Crystal structures of the ten-subunit 
core Pol II, lacking the Rpb4-Rpb7 stalk, trapped the clamp in two
different open positions”, whereas binding of nucleic acids resulted in 
a defined, closed position of the clamp’’ that corresponded to the one 
observed here. The same closed position was observed in the absence of 
nucleic acids in crystals of the complete Pol II containing the stalk!?4, 
although we observed here mobility of the clamp even in the presence 
of the stalk. In previous crystals, the stalk was involved in crystal con- 
tacts, but its conformation nevertheless matched the one seen in EM 
studies of yeast Pol Il'°. In the mammalian Pol II EC presented here, 
the stalk adopts a different orientation, which resembles that seen in 
a yeast EM structure of the core initiation complex bound by the core 
Mediator'®. Taken together, crystallization traps Pol II conformations 
that exist in solution, whereas EM has the potential to unveil multiple 
conformations, providing insights into the conformational dynamics 
of the complex. 

The cryo-EM density for EC1 revealed protein side chains and single
nucleotides, and was comparable in quality to crystallographic maps 
calculated with refined model phases at a similar resolution (Fig. 1a). 
We built an atomic model based on the homologous yeast Pol II 
structure and refined the model in real space. The resulting struc- 
ture included 95% of all Pol II residues (omitting the flexible carboxy 
(C)-terminal domain of Pol II subunit RPB1) and showed very good 
stereochemistry (Extended Data Table 1). The structure represents one 
of only four asymmetric macromolecular structures with a molecular 
mass of around 1 MDa or less that were thus far resolved by cryo-EM 
at near-atomic resolution (<4 Å)!*.

The overall structure of the mammalian Pol II EC (Fig. 1b) resem- 
bles the yeast EC crystal structures!*”°, but also shows differences 
in surface regions of the polymerase (Extended Data Fig. 4a, b and
Extended Data Table 2). First, mammalian Pol II exhibits an inser- 
tion in subunit RPB8 that contacts the RPB1 foot domain, which is 
rearranged compared with the yeast enzyme (Fig. 2a). The Saccharomyces 
cerevisiae foot domain forms a transient contact with the coactivator 
complex Mediator’. It is possible that these differences affect mam- 
malian Mediator binding. Second, mammalian Pol II contains two 
insertions in the region of the Pol II pore that accommodates the RNA 
cleavage factor TFIIS beneath the active site (Fig. 2b). The insertion 
β20–β21 contains a new negatively charged helix (called here α16-2)
that may interact with a positively charged region of TFIIS domain 3. 
The insertion in strand 832 is in a position to interact with the linker 
between TFIIS domains 2 and 3. This may be relevant for binding fac- 
tors with TFIIS-like domains that are present in mammalian cells, but 
not in yeast. Third, the structure reveals that the RPB5 jaw domain is 
rotated by ~5° towards downstream DNA (Fig. 2c, d and Extended 
Data Fig. 4c). Fourth, the EC1 map reveals nearly continuous density 
for the trigger loop in the active centre. The trigger loop adopts an open 
conformation, similar to the trapped state observed in the presence 
of backtracked RNA”! and an open state observed in a recent X-ray 
structure of a yeast Pol II EC” (Extended Data Fig. 4d). 

With respect to the nucleic acids, downstream DNA enters the active 
centre cleft and unwinds before the active site when the template strand 
passes over the bridge helix. The RNA transcript emanates from the active 
site aspartate loop and forms a hybrid with the DNA template strand. 
RNA separates at the upstream end of the hybrid and passes beneath 
the lid loop through the proposed RNA exit tunnel to reach the Pol II 
surface near the dock domain when it is 14 nucleotides long. Density for 
exiting RNA extends beyond this point, but could not be modelled owing 
to its increasing flexibility on the enzyme surface. The upstream DNA 
duplex also shows weaker density, consistent with its known mobility”’, 
but adopts a defined position in the EC3 reconstruction (Fig. 3a). 

Most interactions between Pol II and the nucleic acids occur in the 
region of the DNA-RNA hybrid and generally involve residues that 
are either identical or conserved between yeast and human enzymes 
(Fig. 3b, c). Density for an insertion within the RPB2 lobe (loop β9-β10) 
approached the downstream DNA. This loop is conserved in metazoa 
Figure 1 | Cryo-EM structure of mammalian 
Pol II EC at 3.4 Å resolution. a, Representative 
regions of the cryo-EM density for EC1 with the 
refined model superimposed. Depicted are (from 
left to right) RPB1 helix α19, RPB8 strand β8, the 
DNA-RNA hybrid, and the active site aspartate 
loop with the bound catalytic metal ion A and the 
3’-nucleotide of the RNA transcript. b, Ribbon 
model. The views correspond to the previously 
used ‘top’ and ‘side’ views of yeast Pol II and are 
related by a 90° rotation around a horizontal axis. 
Black spheres indicate the location of residues that 
are not identical between bovine and human Pol II, 
bovine indicated second (three out of seven 
residues, the remaining four are disordered). The 
final model lacked several short surface loops and 
flexible N-terminal residues. 




but not yeast, and contains two lysine residues near DNA. Further 
downstream, DNA contacts the Pol II jaw domain of subunit RPB5, 
which was moved by up to 3 Å compared with the yeast EC. The contact 
is made by a 'TPSA' motif comprising four conserved amino-acid 
residues that form the first turn of helix α6 in RPB5 (Fig. 2d). A contact 
of downstream DNA with RPB5 is already established in the closed 
Figure 2 | Selected mammalian-specific features that differ from yeast 
Pol II (PDB accession number 1Y1W). a, An insertion in RPB8 and a 
rearrangement in the RPB1 foot domain in bovine Pol II lead to increased 
contacts in this region. Bovine Pol II is shown in darker colours (RPB1 
dark grey; RPB8 dark green) and yeast Pol II is shown in lighter colours 
(RPB1 light grey; RPB8 light green). Insertions within bovine Pol II are 
coloured purple and insertions within yeast Pol II are coloured brown. 

b, Two insertions in RPB1 near the pore and TFIIS-binding region. 
Colouring is as in a, and TFIIS is in blue. c, Changes in the RPB5 jaw 
domain compared with the structurally conserved RPB5 assembly domain. 
Insertions are coloured as in a. d, Contact of the TPSA motif (black) in 
RPB5 with downstream DNA (blue). 
Figure 3 | Pol II-nucleic-acid interactions. 
a, EM density for nucleic acids. The density for 
the B-factor sharpened Pol II EC1 map is shown 
in grey mesh. The density for the unsharpened 
Pol II EC3 map is shown in blue mesh. The 
view corresponds to the previously used ‘side’ 
view. b, Pol II residues interacting with nucleic 
acids. Schematic of the nucleic-acid scaffold 
with RNA in red, template DNA in blue, and 
non-template DNA in cyan. Dark filled circles 
represent nucleotides that were well resolved both 
in this study and in the previous yeast EC X-ray 
structure (PDB accession number 1Y1W). Lighter 
filled circles represent nucleotides that could be 
additionally modelled. Unfilled circles represent 
nucleotides that were not modelled. Residues 
subunit chain identifier (A-L for RPB1-RPB12) 
yeast and bovine Pol II are in green, similar 
residues are black. The polymerase domains in 
proximity to single-stranded non-template DNA 
are in black, whereas domains in proximity to 
upstream or downstream DNA are denoted in 
orange. c, Pol II residues interacting with the 
DNA-RNA hybrid. 
promoter complex and an open complex mimic’, indicating that 
it is maintained during the transition from initiation to elongation 
(Extended Data Fig. 4e). These observations are consistent with the 
model that downstream DNA is moved into the Pol II cleft by a translocation 
along RPB5 upon TFIIH action and DNA opening””>. Studies 
of ten-subunit yeast Pol II have revealed a similar contact with down- 
stream DNA”*4, 

The orientation of upstream DNA with Pol II deviates from previ- 
ous observations made in the yeast system. Upstream DNA emanates 
from the Pol II cleft between the clamp and protrusion domains at 
an angle of ~105° with respect to downstream DNA, compared with 
the previously described ~90° on the basis of fluorescence resonance 
energy transfer measurements”® and ~130° on the basis of a recent 
X-ray structure of a 10-subunit yeast Pol II with a larger, 15-nucleotide 
DNA mismatch bubble”. The deviation in the angle of the upstream 
DNA could be a result of the larger mismatch region, the redistribution 
of positive charges on the protrusion between yeast and mammals, or 
an altered conformation of loop β28-β29 on the RPB2 wall domain that 
approaches the minor groove of upstream DNA. 

The observed position of upstream DNA is compatible with bind- 
ing of the elongation factor DSIF (SPT4-SPT5) to the clamp as mod- 
elled on the basis of the archaeal and yeast complexes, whereas the 
previous position of upstream DNA clashed with DSIF****. To test 
our model, we prepared an EC with bound DSIF and determined the 
location of DSIF with the use of negative stain EM and crosslinking 
coupled to mass spectrometry. As predicted, SPT4 and the NusG amino 
(N)-terminal (NGN) domain of SPT5 occupied the top of the cleft, 
bridging between the clamp on one side and the lobe and protrusion on 
the other side of Pol II (Fig. 4 and Supplementary Table 1). Modelling 
on the basis of the previous X-ray structure of an archaeal clamp com- 
plex with SPT4-SPT5 places positively charged regions of the SPT5 
NGN domain in close proximity to the observed position of upstream 
DNA and non-template DNA (Extended Data Fig. 5), consistent with 
a role of this conserved family of elongation factors in stabilizing the 
upstream edge of the transcription bubble??””. 

In summary, high-resolution structural analysis of a mammalian 
Pol II has become possible owing to advances in cryo-EM with the 
use of crystallization-grade preparations of the enzyme that have been 


available for years but never produced diffraction-quality crystals. 
The cryo-EM density map for the mammalian Pol II EC resembles 
in quality and resolution the best crystallographic maps obtained for 
the corresponding yeast complex. Comparison of the mammalian 
cryo-EM structure with the yeast X-ray structures reveals differences 
between these highly similar enzymes and provides insights into the 
conformational control of enzyme function. Most importantly, this 
work provides the basis for future structure determination of mam- 
malian Pol II transcription complexes with mammalian-specific 
factors. 
Figure 4 | Location of DSIF on the Pol II EC. Negative stain EM and 
crosslinking of the Pol II-DSIF EC. A difference map between the negative 
stain Pol II-DSIF EC reconstruction at 26 Å resolution and the EC model 
is shown in yellow. Pol II residues observed to crosslink to DSIF are shown 
as black spheres. Although present in the Pol II-DSIF EC, nucleic acids are 
not visible owing to the use of negative stain. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Purification of bovine Pol II. Bovine Pol II was prepared as described! with 
modifications. Unless otherwise noted, all steps were completed at 4°C. Protease 
inhibitors included 1 mM PMSF, 1 mM benzamidine, 60 μM leupeptin, and 200 μM 
pepstatin. Calf thymus was homogenized for 3 min in buffer A (50 mM Tris, pH 
7.9 at 4°C, 1 mM EDTA, 10 μM ZnCl2, 10% glycerol, 1 mM DTT, protease inhibitors) 
using a 2 l blender (Waring). The homogenized material was centrifuged 
and the supernatant filtered through two layers of Miracloth. A 5% solution of 
polyethyleneimine, pH 7.9 at 25°C, was added to a final concentration of 0.02%, 
and the material was stirred for 10 min then centrifuged. The resulting pellets 
were washed with buffer A before resuspension in buffer A (0.15 M ammonium 
sulfate). After centrifugation, the conductivity of the supernatant was adjusted to 
that of buffer A (0.2 M ammonium sulfate), and the resulting material was loaded 
on a 225-ml MacroPrepQ column equilibrated in buffer A (0.2 M ammonium 
sulfate). The column was washed with two column volumes of buffer A (0.2 M 
ammonium sulfate), followed by Pol II elution with buffer A (0.4M ammonium 
sulfate). The eluate was precipitated by addition of finely ground ammonium sul- 
fate added to 50% saturation, and pellets were collected by centrifugation. The 
pellets were resuspended in buffer A, and the conductivity was adjusted to that of 
buffer A (0.15 M ammonium sulfate). The material was clarified by centrifugation, 
and further purified using a 5-ml gravity flow column of 8WG16 (aRPB1 CTD) 
antibody-coupled sepharose equilibrated in buffer A (0.15 M ammonium sulfate). 
After application of the input material, the antibody column was washed with five 
column volumes of buffer A (0.5 M ammonium sulfate), sealed, and allowed to 
equilibrate to room temperature (20-25 °C) for 15 min. Pol II was eluted using 
buffer A (0.5 M ammonium sulfate, 50% (v/v) glycerol), and Pol-II-containing 
fractions were immediately mixed with buffer A (2mM DTT, lacking glycerol 
and protease inhibitors). The diluted material was centrifuged and subjected 
to anion exchange chromatography using a UNO-Q column equilibrated in 
buffer A (0.1M ammonium sulfate, 2mM DTT, lacking protease inhibitors). Pol II 
was eluted using a linear gradient from 0.1 M to 0.5 M ammonium sulfate in buffer A 
(2mM DTT, lacking protease inhibitors). For the purification of 12-subunit bovine 
Pol II, the Gdown1-free Pol II fraction was applied to a Sephacryl S-300 HiLoad 
sizing column equilibrated in buffer B (150 mM NaCl, 5mM HEPES pH 7.25 at 
25°C, 10 μM ZnCl2, 10 mM DTT). For the purification of bovine Pol II containing 
Gdown1, the Gdown1-free Pol II fraction was incubated with a 3x molar excess of 
human Gdown1 for 1 h at 4°C before application to the Sephacryl S-300 HiLoad 
sizing column. Pol-II-containing fractions were concentrated using a 100-kDa 
cutoff Amicon concentrator to a final concentration of 2-4 mg ml−1. 
Preparation of recombinant human proteins. Gene-optimized human Gdown1 
(Life Technologies) was cloned into pOPINB (N-terminal His6 tag and 3C protease 
site). After transformation, Escherichia coli BL21(DE3)RIL cells were grown 
at 37°C in Lysogeny broth (LB) medium to an absorbance at 600 nm (A600 nm) of 
0.5 before protein expression with 0.5 mM IPTG for 3-4h at 37°C. Subsequent 
steps were completed at 4°C unless otherwise noted. Cells were lysed by sonication 
in buffer C (50 mM HEPES pH 7.5 (25°C), 300 mM NaCl, 1 mM CaCl2, 10% 
glycerol) supplemented with 10 mM imidazole, 1 mM PMSF, 1 mM benzamidine, 
1 mM sodium metabisulfite, 1 mM DTT, and 2 μg ml−1 DNase I. Cleared lysate was 
subjected to affinity chromatography using Ni-NTA agarose (Qiagen), and excess 
chaperone was removed by washing the resin with a 5 mM ATP and 2 mg ml−1 
denatured E. coli protein wash at room temperature in buffer C supplemented 
as above containing 30 mM imidazole. Protein was eluted with buffer C supplemented 
as above, but lacking DNase I and containing 250 mM imidazole. Elutions 
were exchanged into buffer C supplemented with 10 mM imidazole and 1mM DTT 
via a PD10 desalting column, followed by 3C protease cleavage at 4°C overnight. 
Cleaved Gdown1 was subjected to reverse chromatography (Ni-NTA agarose) 
followed by dilution with buffer D (50 mM HEPES pH 7.5 (25°C), 1 mM CaCl2, 
10% glycerol, 2mM DTT) to a conductivity of buffer D containing 0.05 M NaCl. 
Diluted protein was subjected to cation exchange chromatography (MonoS 5/50) 
to remove additional chaperone, and eluted with a linear gradient from 0.05 M 
to 0.5 M NaCl in buffer D. The conductivity of the Gdown1-containing fractions 
was again adjusted to that of buffer D containing 0.05 M NaCl, and applied to a 
MonoQ 5/50 anion exchange column. Gdown1 was eluted using a linear gradient 
from 0.05 M to 0.5 M NaCl in buffer D. Fractions containing purified Gdown1 
were pooled, resulting in a final concentration of 1-1.5 mg ml−1. Yield was approximately 
2.5 mg per 2 l of E. coli culture. 

Purification of human SPT4 and SPT5 was as described*!, with adaptations. 
Gene-optimized human SPT5 (pMK vector, no tag) and SPT4 were purchased 
from Life Technologies, and SPT4 was recloned into pOPINJ (N-terminal HIS6 
and GST tags followed by a 3C protease cleavage site). SPT4 and SPT5 vectors 
were co-transformed into E. coli BL21(DE3)RIL cells, which were then grown at 
37°C in LB medium supplemented with 10 μM ZnCl2 to A600 nm = 0.6. Expression 
was induced with 1 mM IPTG for 18 h at 18°C. Cells were lysed by sonication in 
buffer E (25 mM Tris pH 7.4 (4°C), 500 mM NaCl, 10 μM ZnCl2, 5 mM DTT) 
supplemented with 5 mM imidazole and protease inhibitors (1 mM PMSF, 1 mM 
benzamidine, 60 μM leupeptin, and 200 μM pepstatin). Soluble material was passed 
over a Ni-NTA agarose column and washed with ten column volumes each of 
buffer E supplemented with 20 mM or 40 mM imidazole before elution in buffer E 
supplemented with 300 mM imidazole. Eluted protein was cleaved with 3C protease 
during overnight dialysis (4°C) against buffer E, then subjected to reverse chro- 
matography. Protein was passed over a HiTrap Q HP anion exchange column to 
remove DNA, and the flow-through fraction containing SPT4/5 was concentrated 
using a 50-kDa cutoff Amicon concentrator to 1-4 mg ml−1. 

Elongation complex preparation. The nucleic-acid scaffold (Metabion) was 
only slightly modified from what was used for a yeast Pol II EC crystal struc- 
ture” by mutation of the DNA templating base from C to A to generate a fully 
mismatched open bubble and by removing the upstream and downstream CC 
overhangs. A 1.5x molar excess of pre-annealed template DNA (sequence 
5'-AAGCTCAAGTACTTAAGCCTGGTCATTACTAGTACTGCC-3’), non- 
template DNA (sequence 5’-GGCAGTACTAGTAAACTAGTATTGAAAGTA 
CTTGAGCTT-3’), and RNA (sequence 5’-UAUAUGCAUAAAGACCAGGC-3’) 
were incubated with Pol II-Gdown1 at 4°C for 10 min, then 20°C for 15 min. To 
increase the randomness of Pol II EC particle orientations, the resulting complex 
(0.85 μM) was crosslinked with 3 mM BS3 (Thermo) for 30 min at 30°C, then 
quenched with 50 mM ammonium bicarbonate. Crosslinked complex was applied 
to a Superdex 200 increase 10/300 GL column equilibrated in buffer B (150 mM 
NaCl, 5 mM HEPES pH 7.25 at 25°C, 10 μM ZnCl2, 10 mM DTT). The nucleic-acid- 
containing peak was concentrated to ~0.3 mg ml−1 as described above and used 
immediately for cryo-EM grid preparation. 
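
The scaffold design described above can be checked directly from the three sequences. The following is a minimal sketch, not part of the published methods; the helper names (`bubble_positions`, `hybrid_site`, `utp_only_extension`) are hypothetical and chosen only for illustration. It locates the non-template positions that cannot pair with the template, finds the longest run of Watson-Crick pairs starting at the RNA 3′ end, and counts the consecutive template adenines that would be read next.

```python
# Sequences copied from the scaffold description (all written 5'->3').
TEMPLATE = "AAGCTCAAGTACTTAAGCCTGGTCATTACTAGTACTGCC"
NON_TEMPLATE = "GGCAGTACTAGTAAACTAGTATTGAAAGTACTTGAGCTT"
RNA = "UAUAUGCAUAAAGACCAGGC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def bubble_positions(template, non_template):
    """1-based non-template positions that cannot pair with the template."""
    paired = reverse_complement(template)
    return [i + 1 for i, (a, b) in enumerate(zip(paired, non_template)) if a != b]


def hybrid_site(template, rna):
    """Longest run of Watson-Crick pairs that starts at the RNA 3' end.

    Returns (0-based template index pairing with the RNA 3' base, run length).
    """
    probe = reverse_complement(rna.replace("U", "T"))
    best = (0, 0)
    for start in range(len(template)):
        run = 0
        while (start + run < len(template) and run < len(probe)
               and template[start + run] == probe[run]):
            run += 1
        if run > best[1]:
            best = (start, run)
    return best


def utp_only_extension(template, hybrid_start):
    """Count consecutive template A bases read after the RNA 3' end.

    The template is read towards its 5' end, so we walk to lower indices.
    """
    count = 0
    pos = hybrid_start - 1
    while pos >= 0 and template[pos] == "A":
        count += 1
        pos -= 1
    return count


bubble = bubble_positions(TEMPLATE, NON_TEMPLATE)
start, length = hybrid_site(TEMPLATE, RNA)
print("unpaired non-template positions:", bubble[0], "to", bubble[-1])
print("contiguous pairs at RNA 3' end:", length)
print("template A bases ahead of RNA:", utp_only_extension(TEMPLATE, start))
```

With the sequences as printed, this reports an 11-nucleotide unpaired stretch (non-template positions 15-25), eight contiguous pairs at the RNA 3′ end, and two template adenines ahead of it, in line with the two-nucleotide UTP extension reported for the bubble scaffold in the transcription assays.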

Electron microscopy. Four microlitres of sample were applied to glow-discharged 
Quantifoil R 3.5/1 holey carbon grids, which were then blotted and plunge-frozen 
in liquid ethane using a Vitrobot (FEI). Data were acquired using an FEI Titan 
Krios operated in energy-filtered transmission electron microscopy (EFTEM) 
mode at 300kV equipped with a Gatan K2 Summit direct detector. Automated data 
collection was performed using the TOM toolbox*. Movie images were collected at 
a nominal magnification of ×37,000 (1.35 Å per pixel) in ‘super-resolution mode’ 
(0.675 Å per pixel) at a dose rate of about nine electrons per pixel per second. Two 
movies were acquired per hole, and each movie encompassed a total dose of ~43 
electrons per square ångström over 8 s fractionated into 40 frames (0.2 s each). 
Defocus values ranged from −0.6 μm to −3.1 μm. Movies were aligned and binned 
as previously described'>"?, except that images were not partitioned into quadrants. 
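
The exposure figures quoted above are related by simple arithmetic, sketched below for orientation. The variable names are hypothetical, and one assumption is made that the text does not state: the dose rate of about nine electrons per pixel per second is taken to refer to the 1.35 Å pixel rather than the super-resolution pixel.

```python
# Values quoted in the data-collection description.
EXPOSURE_S = 8.0          # total exposure per movie (s)
FRAME_S = 0.2             # exposure per frame (s)
TOTAL_DOSE = 43.0         # reported total dose (electrons per square angstrom, approximate)
DOSE_RATE = 9.0           # "about nine" electrons per pixel per second
PIXEL_A = 1.35            # pixel size at nominal magnification (angstrom)

# Number of frames implied by the exposure and frame times.
n_frames = round(EXPOSURE_S / FRAME_S)

# Average dose carried by each frame.
dose_per_frame = TOTAL_DOSE / n_frames

# Dose implied by the nominal rate, ASSUMING the rate is per 1.35 angstrom pixel.
dose_from_rate = DOSE_RATE * EXPOSURE_S / PIXEL_A ** 2

# Nyquist limit of the binned images (two pixels per period).
nyquist_a = 2 * PIXEL_A

print(n_frames, round(dose_per_frame, 3), round(dose_from_rate, 1), nyquist_a)
```

Under that assumption the nominal rate gives roughly 40 electrons per square ångström, close to the reported ~43; the difference is within what the rounded "about nine" figure allows. The Nyquist limit of 2.7 Å lies beyond the 3.4 Å resolution reported for EC1.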

Unless otherwise noted, processing was performed using RELION 1.3 (ref. 34). 
Contrast transfer function (CTF) parameters were estimated using CTFFIND4 
(ref. 35). Initial 2D classes were generated after semi-automated picking of ~10,000 
particles (box size 204) using e2boxer.py (EMAN2)°*°. Sixteen distinct classes 
were low-pass filtered to 25 Å resolution and used as templates for autopicking’”, 
resulting in 476,100 particles selected from 1,172 micrographs. The autopicked 
particles were subjected to manual screening followed by screening by 2D clas- 
sification, yielding an input data set of 409,401 particles. A previously published 
22-Å-resolution cryo-negative stain reconstruction of human Pol II (EMD-1282)* 
filtered to 50 Å was used as an initial reference for 3D refinement. Before any 
3D classification, data were subjected to the particle polishing movie-processing 
algorithm of RELION 1.3 (ref. 38), resulting in an improvement in resolution from 
3.7 Å to 3.4 Å. 
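
As a rough bookkeeping aid, the particle numbers quoted above can be turned into per-micrograph and retention figures. This is an illustrative sketch only; the variable names are hypothetical, and the box size is assumed to be given in pixels at 1.35 Å per pixel, which the text does not state explicitly.

```python
# Counts quoted in the particle-picking description.
AUTOPICKED = 476_100      # particles selected by autopicking
MICROGRAPHS = 1_172       # micrographs used
AFTER_SCREENING = 409_401 # particles kept after manual and 2D screening
BOX_PIXELS = 204          # box size (ASSUMED to be in pixels)
PIXEL_A = 1.35            # pixel size (angstrom), ASSUMED to apply to the box

particles_per_micrograph = AUTOPICKED / MICROGRAPHS
retained_fraction = AFTER_SCREENING / AUTOPICKED
discarded = AUTOPICKED - AFTER_SCREENING
box_edge_a = BOX_PIXELS * PIXEL_A

print(f"{particles_per_micrograph:.0f} particles per micrograph")
print(f"{retained_fraction:.1%} retained, {discarded} discarded")
print(f"box edge of about {box_edge_a:.0f} angstrom")
```

On these numbers, roughly 86% of autopicked particles survive screening.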

Three-dimensional classification was performed without image alignment as 
outlined in Extended Data Fig. 2. Masks were chosen to include either the entire 
Pol II EC or a smaller region of interest. The full data set was used as input for the 
classification of heterogeneity in the region of upstream DNA density. Only classes 
displaying strong clamp density were used as input for classification of conforma- 
tions of the RPB4-RPB7 stalk. After classification, data were again subjected to 3D 
refinement with a 50 Å filtered reference volume. B-factors were automatically estimated 
in RELION™ and resolutions were reported on the basis of the gold-standard 
Fourier shell correlation (FSC) (0.143 criterion)*® as described*!. The Pol II EC1 
reconstruction was calculated from 264,134 particles to 3.4 Å resolution and sharpened 
with a B-factor of −137 Å². The Pol II EC2 (improved RPB4-RPB7 stalk 
density) reconstruction was calculated from 219,265 particles to 3.6 Å and sharpened 
with a B-factor of −128 Å². The Pol II EC3 (improved upstream DNA density) 
reconstruction was calculated from 184,122 particles to 3.7 Å and sharpened with 
a B-factor of −123 Å². 
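
To give a feel for what these sharpening B-factors mean, the sketch below evaluates the usual amplitude weighting exp(−B/(4d²)) at each map's nominal resolution d. It assumes the standard convention in which a negative B-factor boosts high-resolution terms; it is not a reimplementation of the RELION post-processing used in the study, and the function name is hypothetical.

```python
import math

# (B-factor in square angstrom, nominal resolution in angstrom) as quoted above.
MAPS = {
    "EC1": (-137.0, 3.4),
    "EC2": (-128.0, 3.6),
    "EC3": (-123.0, 3.7),
}


def sharpening_weight(b_factor, resolution_a):
    """Amplitude multiplier exp(-B / (4 d^2)) at resolution d.

    Assumes the common convention F_sharp = F * exp(-B * s^2 / 4), s = 1/d.
    """
    return math.exp(-b_factor / (4.0 * resolution_a ** 2))


weights = {name: sharpening_weight(b, d) for name, (b, d) in MAPS.items()}
for name, weight in weights.items():
    print(f"{name}: amplitudes at nominal resolution scaled by about {weight:.1f}x")
```

Under this convention, Fourier amplitudes at the nominal resolution are scaled up by roughly one order of magnitude, while low-resolution terms are left almost unchanged.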

Focused refinements were achieved by continuing a refinement of the full data 
set from the iteration at which local searches began, but replacing the mask encom- 
passing the entire Pol II EC density with a soft mask around the region of interest 
and allowing the refinement to continue to convergence. Local resolution was 
calculated using a sliding window method as described!*”, except that a single 
pair of half maps was used per estimation and local resolution was not capped at 
the nominal value. Figures were generated using UCSF Chimera”. 

Model building and refinement. Alignments for each Pol II subunit were 
generated using the Homo sapiens, Bos taurus, Drosophila melanogaster, 
Schizosaccharomyces pombe, and S. cerevisiae sequences followed by alignment 
using Clustal Omega“. The Pol II crystal structure PDB 4BBS* was chosen as a 
reference, as it displayed good stereochemistry and included the most complete 
model of the RPB2 protrusion domain. A starting model of ten-subunit Pol II was 
generated using the CCP4 (ref. 46) program chainsaw” along with the alignment of 
the B. taurus and S. cerevisiae sequences. Conserved amino acids were retained, and 
non-conserved amino acids were pruned back to the gamma atom. The starting 
model was placed in the Pol II EC density by fitting in UCSF Chimera’, followed 
by fitting of rigid body groups in COOT*. Groups for rigid body refinement were 
chosen on the basis of observations from Pol II X-ray studies and visual inspection 
of the initial fit to the density. In regions of sufficient density quality, the model 
was manually adjusted to include complete side chains, and missing or divergent 
regions were built in COOT using B-factor sharpened maps. Multiple maps were 
used during model building, including the densities for the refinement using all 
data, Pol II EC1, Pol II EC2, Pol II EC3, and focused refinements. 

To generate a complete EC model, one molecule (chains A and B) of the crystal 
structure of human RPB4-RPB7 (PDB 2C35)*” was docked into the Pol II EC2 
map. Regions near the ten-subunit core were adjusted manually to fit the den- 
sity. Amino-acid side chains (previously stubbed) were added to the model if side 
chain density was visible. Ideal B-form DNA was manually fitted into the Pol II 
EC3 upstream DNA density. To improve backbone geometry, the EC model was 
subjected to PHENIX real space refinement (global minimization and ADP refine- 
ment) into one of the three unsharpened EC maps using Ramachandran, rotamer, 
and nucleic-acid restraints*’. EC3 was used for refinement of the upstream DNA 
(chain N residues 1-13 and chain T residues 27-39). Because of the lower local res- 
olution of the distal end of the upstream DNA, residues 1-10 of the non-template 
strand and residues 29-39 of the template strand were replaced with ideal B-form 
DNA that had been aligned to the refined upstream nucleic acids. EC2 was used 
for refinement of RPB4 and RPB7, and EC1 was used for the remaining model (EC 
body). The EC body was additionally refined as described above using the sharp- 
ened EC1 map to better position well-resolved side chains and nucleic acids within 
the density®. The final model was validated using Molprobity*!, EMRinger”’, 
and the FSC of the final model versus the EC1 map (Extended Data Fig. 3b). For 
model versus map FSC calculations, the EC1 map was masked using the RELION- 
generated soft automask used in postprocessing. 

Crosslinking and mass spectrometry. Nucleic-acid scaffold used to assemble 
Pol II EC complexes was modified to include a 50 nucleotide RNA (sequence 5’- 
GAACGAGAUCAUAACAUUUGAACAAGAAUAUAUAUACAUAAAGACCA 
GGC-3’), as previous data have shown that DSIF affinity for ECs is increased as 
RNA length increases°’. RNA was produced and purified as previously described™. 
A twofold molar excess of pre-annealed RNA and template DNA was incubated 
with Pol II for 20 min at 25°C, followed by incubation with fourfold excess of non- 
template DNA for an additional 20 min at 25°C. A fivefold molar excess of DSIF 
was incubated with the resulting Pol II EC for 20 min at 25°C. Pol II-DSIF EC sam- 
ple was applied to two consecutive Superdex 200 10/300 size-exclusion columns 
equilibrated in buffer B (150 mM NaCl, 5 mM HEPES pH 7.25 at 25°C, 10 μM 
ZnCl2, 10 mM DTT). Purified Pol II-DSIF EC was concentrated to ~0.5 mg ml−1 
(~0.74 μM) and crosslinked with 3 mM BS3 (BS3-d0/d12, Creative Molecules) 
as described above. Crosslinked sample was again applied to two Superdex 200 
10/300 columns, resulting in ~25 μg material. Sample was digested with trypsin, 
and analysed as previously described!>». 
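
The mass and molar concentrations quoted for the Pol II-DSIF EC are linked by the molecular mass of the complex. The sketch below is a back-of-envelope check rather than a measured value; it takes the concentration as ~0.5 mg ml−1 and ~0.74 μM and the crosslinker as 3 mM, and the variable names are hypothetical.

```python
# Concentrations quoted for the Pol II-DSIF EC before crosslinking.
MASS_CONC_MG_PER_ML = 0.5   # approximate; mg/ml is numerically equal to g/l
MOLAR_CONC_UM = 0.74        # approximate, micromolar
CROSSLINKER_MM = 3.0        # BS3 concentration, millimolar

# Molecular mass implied by the two concentrations (g/mol -> kDa).
implied_mass_kda = MASS_CONC_MG_PER_ML / (MOLAR_CONC_UM * 1e-6) / 1000.0

# Molar excess of crosslinker over complex.
crosslinker_excess = (CROSSLINKER_MM * 1e-3) / (MOLAR_CONC_UM * 1e-6)

print(f"implied mass of about {implied_mass_kda:.0f} kDa")
print(f"crosslinker in roughly {crosslinker_excess:.0f}-fold molar excess")
```

Both inputs are approximate, so the implied mass of around 680 kDa should be read as an order-of-magnitude consistency check only.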

Negative stain electron microscopy of the Pol II-DSIF EC. Pol II-DSIF EC complexes 
were prepared as described above. Sample (~30 μg ml−1) was applied to 
glow-discharged 400 mesh copper grids coated with continuous carbon (Plano 
EM) for 1 min, washed on 500 μl water for 30 s, floated for 20 s on three consecutive 
20-μl drops of 2% uranyl formate stain, and blotted to remove excess stain. Data 
were collected using an FEI Tecnai Spirit operated at 120 kV and a magnification 
of ×90,600. Micrographs were collected using a defocus range from −1.0 to 
−1.5 μm with an FEI Eagle CCD (charge-coupled device) camera binned 2 
(image dimensions 2,048 pixels × 2,048 pixels) at a pixel size of 3.31 Å. 
Semi-automatic picking using e2boxer.py (EMAN2) yielded 11,531 particles from 120 
micrographs. Data were subjected to 3D classification in RELION (eight classes, 
no CTF correction) using the cryo-negative stain reconstruction of human Pol II 
(EMD-1282)* low-pass filtered to 60 Å as an initial reference. The two highest 
populated classes (comprising 85% of the data) were further classified into two 
classes each, for a total of four classes in which Pol II features beyond 60 Å were 
recognizable. Two classes did not have discernible additional density compared 
with Pol II (42% of data). A third class (28% of the data) displayed additional 


density near the RPB4-RPB7 stalk. A final class (15% of the data) showed addi- 
tional density over the Pol II DNA binding cleft, as well as additional density near 
the RPB4-RPB7 stalk. Refinement of this subset of data (1,630 particles) resulted 
in a 3D reconstruction at 26 Å resolution, revealing extra density consistent with 
results from crosslinking coupled to mass spectrometry, previous DSIF-RNA 
crosslinking”, and the published interaction between SPT5 and the Pol II clamp 
coiled-coil motif*®. 
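
The class percentages in the negative stain analysis can be tallied as a quick consistency check. This is an illustrative sketch with hypothetical variable names; percentages are as quoted and therefore rounded.

```python
# Figures quoted in the negative stain classification.
TOTAL_PARTICLES = 11_531
MICROGRAPHS = 120
CLASS_PERCENT = {
    "no additional density (two classes)": 42,
    "additional density near stalk": 28,
    "additional density over cleft and near stalk": 15,
}
FINAL_SUBSET = 1_630  # particles in the refined subset

# The four sub-classes should add up to the 85% carried forward.
percent_accounted = sum(CLASS_PERCENT.values())

# Share of all picked particles that entered the final reconstruction.
final_fraction = FINAL_SUBSET / TOTAL_PARTICLES
particles_per_micrograph = TOTAL_PARTICLES / MICROGRAPHS

print(percent_accounted, f"{final_fraction:.1%}", round(particles_per_micrograph))
```

The three quoted percentages sum to the 85% of the data carried into the second round, and the 1,630-particle subset is about 14% of all picked particles, consistent with the rounded 15% quoted for that class.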

Transcription assays. Activity assays were performed as described*’, with mod- 
ifications. For reactions using fully complementary template and non-template 
DNA sequences, Pol II ECs were assembled stepwise beginning with either 12-sub- 
unit bovine Pol II or 12-subunit bovine Pol II in complex with human Gdown1, 
as indicated. Per reaction, Pol II was pre-assembled for 20 min at 28°C with a 
0.5 molar ratio of 5’ 6-FAM-labelled 20-nucleotide RNA annealed to template 
DNA, followed by incubation with a 1.0 molar ratio of fully complementary non- 
template DNA. The RNA and DNA sequences were the same as for the Pol II EC, 
except for an additional 46 nucleotides of downstream DNA. The template DNA 
sequence was 5’-ACAAATTACTGGGAAGTCGACTATGCAATACAGGCAT 
CATTTGATCAAGCTCAAGTACTTAAGCCTGGTCATTACTAGTACTGCC-3’; 
the non-template DNA sequence was 5’-GGCAGTACTAGTAATGACCAGG 
CTTAAGTACTTGAGCTTGATCAAATGATGCCTGTATTGCATAGTCGACT 
TCCCAGTAATTTGT-3’, and RNA sequence was 5’-UAUAUGCAUAAAGA 
CCAGGC-3’. Transcription was allowed to proceed for 10 min at 30°C in the 
presence of 1-100 μM nucleoside triphosphates (NTPs) as indicated, and 0.2 pmol 
product per reaction was visualized on a 15% denaturing urea polyacrylamide gel. 
Transcription assays were also performed on the bubble scaffold used for structural 
studies. Pol II-EC complexes were prepared as described for cryo-EM, except that 
the samples were not crosslinked and the 20-nucleotide RNA used for assembly 
was 5’-labelled with 6-FAM. Assembled complex was incubated with 10-1,000 μM 
UTP at 30°C for 10 min, allowing the extension of the RNA by two additional 
nucleotides. Product was visualized on a 20% denaturing urea polyacrylamide gel 
and imaged using a Typhoon FLA 9500. 
Extended Data Figure 1 | Bovine Pol II purification and transcription 
activity. a, SDS-polyacrylamide gel electrophoresis analysis of size- 
exclusion-purified 12-subunit bovine Pol II, recombinant Gdown1, and 
reconstituted complete Pol II with all 13 subunits (right). b, Extension 

of RNA by bovine Pol II. Activity assay was performed on Pol II ECs 
assembled on fully complementary DNA. Complexes were assembled with 
12-subunit Pol II (Pol II) or 12-subunit Pol II in complex with Gdown1 
(Pol II (G)). Transcription was allowed to proceed for 10 min at 30°C in 
the presence of the indicated concentration of NTPs. Purified endogenous 




bovine Pol II is active in transcription. Approximately 40% of the RNA 
appears to be extended in the presence of Gdown1. c, Extension of RNA by 
bovine Pol II-human Gdown1 on the bubble scaffold used for structural 
studies. Uncrosslinked Pol II EC was incubated with the indicated amount 
of UTP at 30°C for 10 min to allow the incorporation of two additional 
nucleotides. Approximately 30% of RNA is extended. d, Observation 

of individual Pol II ECs of the expected size under cryo-EM conditions 
with a K2 direct electron detection device. e, Fifty representative 2D class 
averages resulting from classification of the full Pol II EC data set. 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Extended Data Figure 2 | Computational sorting of cryo-EM particle 
images. Particle images were sorted by 3D classification in RELION 
to reveal additional features of the Pol II EC. To reduce computational 
costs, all classifications were performed without image alignment. For 
global classifications, a mask encompassing the entire Pol II EC was used, 
whereas for local classifications masks encompassed only the area of 
interest as indicated in the diagram. The percentage of total data included 
in each class is indicated above each panel. Density was coloured to 
assist in visualization. For global classifications (upper left), density was 
coloured according to observed mobile regions. Green: clamp, anchor, part 
of the RPB1 cleft, part of the hybrid-binding region, and part of the active 
site. Orange: RPB9, RPB1 upper jaw, and part of the RPB1 cleft. Purple: 
RPB4-RPB7 stalk. Blue: nucleic acids. For focused classifications, density 
for the region of interest was coloured. This corresponded to the region 
within the classification mask, except for the classification of upstream 
DNA, in which case the density in the region of the entire nucleic-acid 
scaffold was coloured. The three reconstructions used for modelling the 
Pol II EC are shown in yellow, red, and blue boxes. The EC class displaying 
a mobile jaw-lobe region is highlighted with an orange box, and the EC 
class displaying a mobile clamp-active site region is highlighted with a 
green box. 
Extended Data Figure 3 | Quality of the cryo-EM reconstructions. 

a, Angular distribution of particle images for the Pol II EC1, EC2, and 

EC3 reconstructions. Black shading indicates the number of particles 
assigned to a given view, while red dots indicate represented views (at 

least one particle was assigned within 1° of the point). b, FSC plots for the 
reconstructions described in a, as well as the FSC plot for the Pol II EC 
model versus the unsharpened Pol II EC1 map. The first data point after 
phase randomization is omitted. c, Top—front view and middle slice of the 
B-factor sharpened Pol II EC1 reconstruction coloured by local resolution. 
The top-front view (also shown in Extended Data Figure 2) is a rotation of 
either the top view or front view by 45°. d, Top-front view and middle slice 


Stalk focused refinement 


Age 
bent ER 


Protrusion 
focused 
refinement 


of the unsharpened Pol II EC2 reconstruction coloured by local resolution. 
e, Top-front view and middle slice of the unsharpened Pol II EC3 
reconstruction coloured by local resolution. f, RPB4-RPB7 stalk density 
for the unsharpened Pol II EC1, Pol I EC2, and stalk focused refinements 
shown at the same threshold level and filtered to the nominal resolutions 
(3.4A, 3.6 A, and 4.2 A). Continuous density for the stalk is visible in the 
lower resolution focused refinement. g, Density for the protrusion in the 
unsharpened Pol II EC1 and protrusion-focused refinements shown at 
the same threshold level and filtered to the nominal resolutions (3.4 A 
and 3.8 A). Density for the protrusion tip is more visible in the focused 
refinement map. 
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Extended Data Figure 4 | Structural comparison of bovine and yeast 
Pol IT. a, Structures of the bovine and yeast EC (PDB 1Y1W) were 
superimposed by alignment of the active site. The protein regions of the 
bovine EC are shown in two views, coloured by r.m.s.d. of the yeast and 
bovine models. Insertions and uniquely structured regions within the 
bovine EC are coloured purple. b, Surface conservation between yeast 
and bovine Pol II. Strong and weak conservation groups were assigned 
according to Clustal conventions. c, Structural rearrangements within the 
RPB5 jaw domain. Axes were drawn through the four longest helices of 
the bovine and yeast RPB5 jaw using UCSF Chimera. d, Comparison of 
the backbone path of the trigger loop to previously known trigger loop 
conformations. The trigger loop backbone of the bovine Pol II EC is shown 
in thick, dark grey; the open conformation of the yeast EC (PDB 1Y1W)”° 
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is in blue; the closed conformation of the yeast EC in the presence of 

GTP (PDB 2E2H)”* is in yellow; the wedged conformation of the yeast 

EC in the presence of «-amanitin (PDB 2VUM)** is in dark green; the 
trapped conformation of the arrested and backtracked yeast EC (PDB 
3P02)°* is in purple; the locked conformation of the yeast EC reactivation 
intermediate in the presence of TFIIS (PDB 3PO3)*? is in orange; and 

the new open position observed in the ten-subunit yeast transcription 
complex crystallized in the presence of TFIIF is in light green. The binding 
site for a-amanitin is denoted in dashed green lines. The binding site for 
the incoming NTP is outlined in dashed yellow lines. e, Comparison of the 
path of DNA in the closed initiation complex (yellow) and the elongation 
complex (blue, this study). Downstream DNA was extended. 
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Extended Data Figure 5 | Modelled position of the human SPT5 NGN 
domain on the bovine Pol II EC. a, Zoom-in view of Pol II EC3 nucleic- 
acid density superimposed on the Pol II-SPT4-SPT5-NGN EC model 
reveals proximity of non-template DNA to the SPT5 NGN domain. The 
human SPT4-SPT5-NGN crystal structure (PDB 3H7H)*! was positioned 
on the Pol II EC model by alignment with the archaeal RNA polymerase 


Upstream 
DNA ~s 
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clamp-SPT4-SPT5 crystal structure (PDB 3QQC)”*. b, Colouring a 
surface representation of the X-ray structure of the SPT4-SPT5-NGN 

by Coulombic surface charge reveals positively charged patches in close 
proximity to non-template and upstream DNA. Positive charge is in blue; 
negative charge is in red. 
Extended Data Table 1 | Model refinement statistics

Map CC (whole unit cell) 0.835
rmsd (bonds) 0
rmsd (angles) 0.73
All-atom clashscore 14.17
Ramachandran plot
  outliers 0.00%
  allowed 5.08%
  favored 94.92%
Rotamer outliers 0.00%
C-beta deviations 0
EMRinger score 2.62
Molprobity score 2.01

Extended Data Table 2 | Summary of differences between the yeast EC (X-ray, PDB 1Y1W) and bovine EC (cryo-EM)
Clamp core
Clamp head 
Clamp head 
Clamp head 
Dock 
Pore 1 
Foot 
Cleft 
Jaw 
Jaw 
Jaw 
Cleft 
Protrusion 
Protrusion 
Lobe 
Lobe 
Lobe 
Fork loop 1 
Fork loop 2 
External 1 
External 1 
External 1 
Wall 
Wall 
Domain 2 
Loop 
Loop 
Tail 


Jaw 
Zinc ribbon 2 
Zinc ribbon 2 
RPB12 


Sequence range 


B. taurus (S. cerevisiae) 


36-84 (32-80) 
120-136 (116-130) 
156-180 (150-163) 
203-212 (186-198) 
431-444 (417-430) 
604-622 (590-599) 
908-946 (885-977) 

1144-1150 (1121-1127) 

1171-1300 (1148-1275) 

1190-1205 (1167-1186) 

1262-1280 (1243-1255) 

1312-1330 (1287-1300) 

68-81 (72-85) 
133-144 (137-162) 
200-392 (218-405) 
209-215 (227-234) 
242-252 (261-265) 
453-466 (466-479) 
490-496 (503-509) 
630-632 (642-650) 
645-650 (663-678) 
684-690 (712-735) 
820-827 (865-872) 
832-843 (877-888) 
123-147 (121-140) 
197-202 (191-200) 
209-222 (207-221) 
264-271 (263-270) 

45-52 (47-54) 

65-70 (65-76) 

76-127 (82-132) 

16-20 (16-18) 

33-37 (31-36) 

63-84 (62-89) 
102-110 (107-111) 
129-138 (129-138) 

12-54 (2-44) 

63-75 (53-65) 
82-121 (72-108) 

15-16 (25-28) 


Sequence change 


no insertion or deletion
α3-α4 insertion
α4-β4 insertion
β4-β5 deletion
no insertion or deletion
β20-β21 insertion
no insertion or deletion
no insertion or deletion
insertion and deletion (below)
α40-β29 deletion
β31-α43 insertion
β32-β33 insertion
no insertion or deletion
β2-β3 deletion
β7-β8 deletion
β9-β10 insertion
no insertion or deletion
no insertion or deletion
β21-α16 deletion
α16-α17 deletion
α19-β24 deletion
no insertion or deletion
no insertion or deletion
β7-β8 insertion
β10-β11 deletion 1
β10-β11 deletion 2
no insertion or deletion
α3-β1 insertion
α4 deletion
no insertion or deletion
β2-β3 insertion
β3-β4 deletion
β5-β6 deletion
β7-β8 insertion
β9-β10 insertion
N-terminal insertion
no insertion or deletion


β7-β8 insertion


Structural change 


conformational change
conformational change of α3 and connecting loops
loop expansion, conformational change
conformational change
conformational change
loop expansion, helix addition, conformational change
conformational change
conformational change, helix addition
domain-wide movement of loops and α-helices
loop shortening, α40 conformational change
loop expansion
loop insertion, interruption of β32 and β33, conformational change
conformational change
conformational change, β-strand addition
domain movement
loop shortening, conformational change
loop expansion
conformational change
conformational change
loop shortening
loop and α16 shortening
loop shortening
conformational change
conformational change
loop expansion, conformational change
loop shortening, conformational change
conformational change
extension of α3
conformational change
loop shortening, α4 deletion
conformational change


loop expansion, β2 shortening


loop shortening, conformational change 


loop shortening, conformational change 
loop expansion 
conformational change 
domain movement 
conformational change 


conformational change 


N-terminal deletion conformational change 
Faking it


In the face of routine rejection, many scientists must learn to
cope with the insidious beast that is impostor syndrome.


BY CHRIS WOOLSTON 


Every early-career scientist has hit a
stumbling block: a bad grade on an exam,
a low score on a grant proposal or the
first rejection from a journal. But for many, this
normal stuff of science twists into something 
darker and more insidious — a creeping sense of 
professional inadequacy that prompts them to 
question their place in the field, no matter their
area of study or their level of brilliance.

There is a term for this type of self-doubt, 
coined in the 1970s by two US psychologists
who saw it in their clinical practices:
the ‘impostor phenomenon’. Now commonly
known as impostor syndrome, the condition 
can manifest itself in myriad professions, 
from office workers to artists to athletes, says 
Frederik Anseel, a psychologist at Ghent 
University in Belgium. Scientists, he says, are
especially vulnerable, largely because they
work in a hero-oriented field that treats its 
highest achievers as if they were sports stars, 
leaving many others to wonder in silence 
whether they are second-stringers or worse. 
“Young people think that no one else is having 
these feelings,” he says. 

Researchers who struggle with the syndrome 
have to learn how to tune out feelings of inad- 
equacy and develop a more realistic view of their 
abilities and their value, he says. In a profession
where sporadic failure — in grants, in jobs, in 
publications — is the norm, the real failure is 
unnecessarily giving up on a promising career. 


UNHAPPY SUCCESSES 

In 2014, Anseel and his colleagues took a closer 
look at impostor syndrome in a study of more 
than 200 Belgian workers in finance, education 
and human-resource management. The team 
found that workers who reported feelings that 
are consistent with impostor syndrome tended 
to score higher on measures of neuroticism 
and excessive perfectionism in personality tests 
(J. Vergauwe et al. J. Bus. Psychol. 30, 565-581; 
2015). They were also not as happy with their 
jobs as were colleagues who did not experi- 
ence the syndrome — even though some of the 
afflicted had advanced to the upper levels of 
their professions. 

Anseel says that his other work — which 
includes ongoing studies of mental-health 
issues among young researchers — gives him 
confidence that his findings about impostor 
syndrome in the white-collar world apply to 
science as well. He says that it is easy to see how 
even successful scientists can feel that they are 
actually underperformers. Scientists, he says, 
often trivialize their own achievements. “You
get a paper published in PNAS, and you tell
yourself, ‘That’s doable. I’ll never get a paper in
Nature or Science.’” Similarly, any grant could
be larger; any job could be better; any paper 
could be more highly cited. “You set yourself 
up to fail one way or the other,” he says. 

The phenomenon shows up across 
academia, including at top research institu- 
tions. Josh Drew, an evolutionary ecologist at 
Columbia University in New York City, has 
seen PhD and master’s degree students strug- 
gle with self-doubt at the Ivy League school. 
Every student had passed tough admission 
standards — but that was not enough to bolster 
their confidence. For many, their classes at the 
university represented the first time in their 
educational experience that they didn't feel 
as if they were the smartest person in the
room. “They were all outstanding students
as undergrads,” he says. “Here, being at the top
of your class is just average.” 

In a highly competitive arena, self-doubt can
be a career killer that prompts would-be con- 
tenders to dismiss chances to vie for important 
opportunities. “I saw many students who were
shooting themselves in the foot,” Drew says.
“They weren’t applying for grants and awards
that they would be competitive for.” He began
to address the syndrome in an introduction- 
to-graduate-school class. The talks drew 
some buzz, and he soon developed a formal 
presentation to deliver to other departments 
at Columbia and beyond (see ‘Help for impos- 
tors’). Clearly, he had struck a chord. “Every 
talk I give, people say, ‘I thought I was the only 
person who felt this way’,” he says.


TIMELESS CONDITION 
Drew reassures people who feel like frauds by 
pointing out that they are in some lofty com- 
pany. Two years after publishing On the Origin 
of Species in 1859, Charles Darwin complained 
that “one lives only to make blunders”. And 
while working on The Grapes of Wrath (1939), 
John Steinbeck wrote, “I am assailed by my own
ignorance and inability,” fretting that
“sometimes, I seem to do a good little piece of work,
but when it is done it slides into mediocrity.”
While preparing his lecture, Drew solicited 
Twitter comments from scientists who had 
struggled to overcome the syndrome with 
various degrees of success. One respondent, an 
associate professor of biology, tweeted: “It has 
crippled my professional life from day one.”
Moses Milazzo, a planetary scientist with the 
Astrogeology Science Center in Flagstaff,
Arizona, tweeted, “Because of Impostor Syndrome:
I have decided not to pursue opportunities; I am
never ready to publish my papers; etc.”


Ecologist Josh Drew loads samples of reef fish into 
a tank of liquid nitrogen in Nagigi, Fiji. 


Milazzo thinks that impostor syndrome is
nearly universal among scientists, at least
among those who are self-aware enough to
realize that they don’t know everything. But
he also says that he experienced particularly
severe effects of it, and he traces at least some
of his unease to his background. He grew up
in an off-the-grid adobe house on the border
of the Navajo Nation reservation in northern
Arizona, which later contributed to his sense
that he did not belong in the university crowd.
Not many of his instructors, advisers or peers
could relate to hardscrabble desert life in a
house that had no reliable electricity and little
contact with the outside world.

“I didn’t even know what Cosmos was until
I got to grad school,” he says of the popular
1980s TV show. Then again, life in
middle-of-nowhere Arizona did give him an intimate
familiarity with the night sky. During the long
walk along a dirt road to the school bus stop, he
often navigated by the light of the Milky Way.

Milazzo says that he first heard the term
impostor syndrome early in his
graduate-student days at the University of Arizona in
Tucson, and he recognized it immediately.
“Having a name put to it made it clear that
other people felt it,” he says. But knowing
that he was not alone didn’t keep him out of
the trap. He decided not to apply for a NASA
grant for satellite-based research, out of fear
of exposing his own ignorance. “I removed
myself from the grant process because it would
be obvious that I didn’t know what I was
talking about,” he says. As it turned out, one of the
successful proposals was very similar to his
idea. “If I had pursued it, I might have been
competitive,” he laments now.

And even after helping to win a NASA grant
to develop a middle-school curriculum that
would be based on the space agency’s
exploration of the Solar System, Milazzo still struggles
to persuade himself that he belongs in science.
“We put a lot of work into that proposal, but I
wasn’t very confident that it would get funded,”
he says.


SOCIAL SUPPORT


Help for impostors


Evolutionary ecologist Josh Drew of
Columbia University in New York City
hosts a talk called ‘Fighting the Impostor
Syndrome’, which offers tried-and-tested
coping strategies for the common condition.
At a basic level, he urges researchers to
advocate for themselves. That means
avoiding words such as ‘just’ and ‘only’
when describing their own work, and not
constantly apologizing for every mistake,
whether real or perceived. He says that
offering real support to someone else who
feels wracked by doubt is a quick and
effective way to improve your own sense of
belonging. To really pick yourself up, bring
someone else up with you.

Drew says that members of groups that
are underrepresented in science often
benefit by reaching out to others and finding
a community. For some, simply following a
Twitter feed such as #BLACKandSTEM or
#womenandSTEM can serve as reassurance
that they really do belong in science. His
message? “You’re not here because you
ticked some box. You’re here because you
bring a lot to the department.” C.W.


Matt von Hippel, a researcher at the 
Perimeter Institute for Theoretical Physics in 
Waterloo, Canada, says that he, too, feels like an 
impostor from time to time, but he has a strat- 
egy that helps him to push through it. Instead of 
second-guessing the people who admitted him 
to graduate school and awarded him a PhD, he 
decided to embrace their judgement. 

“You can trust the system to have put you in 
vaguely the right job,” he says. “If you're invited 
to give a talk, that’s a sign that you're ready to 
give a talk.” Late last year, he was asked to give 
a colloquium on mathematical techniques in 
particle physics at Oregon State University in 
Corvallis. It was a big opportunity that came 
sooner in his career than he expected, and he 
thought about turning it down. Ultimately, he 
opted to adhere to his strategy. “I decided to say
‘yes’ and see how it goes,” he says. In his view,
the talk was a success. 


UPHILL BATTLE 
Biologist Victoria Metcalf had plenty of 
opportunities to doubt herself and second- 
guess her career choices. Her low point came 
in early 2000 during her PhD studies in New 
Zealand, when television-news crews sur- 
rounded her lab and a regulatory authority 
threatened to throw her and her supervisors 
in prison. Her lab had cloned genes from the 
tuatara (Sphenodon punctatus) — a treasured 
native New Zealand reptile — but lacked the 
permits that a new law had retroactively made 
mandatory. Authorities eventually dropped 
their threats, but her research was stalled for 
six months while she obtained the proper 
permits. “Those were really soul-destroying 
times,” she says. “It had a huge impact on how 
I perceived my worth in academia.” 
Scientists are accustomed to measuring 
things in precise detail, but their own value 
can be difficult for them to quantify. Anseel 
thinks that many researchers would be more 
confident — and thus more likely to write the 
grant, submit the paper, apply for the job — 
if they were to embrace the inevitability of
failure. “When one of my students gets a
rejection letter, I can show them five or
ten of my own,” he says. “The academic
environment should be more open to 
failure stories.” 

Drew reminds young researchers that 
even the chairs of their departments — sci- 
entists who seemingly have it made — do 
not always get their grants funded or their 
papers accepted. It would be telling, he says, 
if everyone published a ‘shadow CV’ of all
their rejections to go along with the stand- 
ard CV that lists successes. 

Researchers can also help to ease their 
distress by making an effort to stop com- 
paring themselves with colleagues in their 
lab or department. “Comparisons won't 
make you happy, so don’t do it,” Anseel says. 
Instead, he says, researchers should set their 
own personal standards of achievement 
and then do their best to meet them. 

Metcalf has mostly won her battle over 
her sense of inadequacy, although her 
career has had its ups and downs. After she 
earned her PhD, she took a postdoc posi- 
tion in the United States that she quit after 
only six months, an outcome that […]
went on to have a successful career that
included research trips to the Antarctic 
and a highly sought-after faculty position 
at Lincoln University in Christchurch, 
New Zealand. 

Yet her troubles didn't end. In 2011, she 
lost her faculty job after an earthquake 
damaged much of the city. Instead of tak- 
ing that setback as a sign that she needed 
to abandon science completely, she shifted 
from research to outreach. She is now the 
national coordinator of the Participatory 
Science Platform, a New Zealand govern- 
ment programme that promotes research 
collaborations between scientists and com- 
munities. “Anyone who knows me knows 
that I was meant for this job,” she says. 

As part of her duties, Metcalf has had 
many chances to speak to young people 
with different backgrounds and career 
aspirations. Many of them are already 
experiencing the symptoms of impostor 
syndrome, which gives her an opportunity 
to inspire by example. “My story really reso- 
nates,” she says. “I’ve had my battles. You 
just have to keep fighting.”


Chris Woolston is a freelance writer in 
Billings, Montana. 


TURNING POINT 
Louis Picker 


Louis Picker is not afraid to break with 
convention. Trained as a pathologist, he was 
on the front line when the AIDS epidemic 
emerged in the 1980s. He is now combining his 
interests in immunology and viruses to pursue 
an unusual HIV vaccine at Oregon Health and 
Science University (OHSU) in Portland — a 
project that was considered a fool’s errand by
many when he began. 


How did you get started in research? 

I had always wanted to be a scientist. I started
an MD-PhD programme at the University of 
California, San Francisco, but found it much 
too slow, rigid and hierarchical. I left that 
programme, but did a year of research there. 
Ultimately, I decided to become a pathologist 
specializing in immunology. It’s astonishing 
how much biology you can learn from look- 
ing at hundreds of biopsy slides and by per- 
forming autopsies every day. I got a feel for the 
immune system that you couldn’t get by doing
graduate research on a mouse. 


Describe your first AIDS autopsy. 

I was a pathology resident at Beth Israel 
Hospital in Boston, Massachusetts. The 
devastation left by AIDS stuck with me. I 
decided to learn more about the disease so that 
I could do something about it one day. I had 
the opportunity to move into HIV research in 
the mid-1990s and haven't looked back since. 


What led you to HIV-vaccine research? 

Early in my career, I worked on a flow- 
cytometry-based assay to measure specific 
T-cell responses to viral infection in humans. 
I chose to work with cytomegalovirus (CMV), 
a virus that infects around 50% of adults in the 
United States and triggers a T-cell response 
that lasts throughout a person's lifetime. These 
factors enabled me to test the specificity of the 
assay. After studying CMV-specific T cells, I 
hypothesized that CMV could be exploited to 
create a vaccine that stimulates an immediate 
immune response to a variety of pathogens. 
By incorporating bits of HIV into the vaccine, 
we could prime T cells to hit the intruding 
virus early and hard. Our data in non-human 
primate models show that the vaccine stops 
infection with the simian counterpart of HIV 
in slightly more than half of recipients. 


What does the next year hold for you? 

We will move into clinical trials with our 
potential HIV vaccine. We are also explor- 
ing the use of unconventional viral vectors 
to manipulate the immune system against
tuberculosis, malaria, hepatitis B and cancer
at a level heretofore unappreciated. 


Why did you choose research over more- 
lucrative private practice? 

I knew that if I wanted to make a difference 
— and to pursue the CMV-based vaccine while 
others focused on conventional antibody-led 
approaches — I had to do lab-based experi- 
ments. As a pathologist, I would never have 
had access to patients. The best way to do 
relevant science was to test my ideas in a non- 
human primate model. The job I took at the 
OHSU was one of two possibilities I had at the 
time to do that type of work. 


How easy was it to pursue your idea? 

I was fortunate to have negotiated a start- 
up package at the OHSU that gave me the 
leeway to gamble. Either I’d make it or break
it. I was warmly welcomed by researchers in
the HIV field, which I appreciated. But it took
me a while to feel that I fit in. Self-doubt was a
powerful driver for me. 


How risky was your decision? 

To be honest, it helped that I had an MD. I 
knew I would always be able to get a job as 
a physician, so the degree allowed me a little 
more freedom in the early years. In the first 
crucial years while I was establishing myself, 
I figured I could always return to pathology. 
Most people with PhDs don’t have that option. 


What makes a great scientist? 
You have to be a little bit of a lunatic. But your 
out-of-the-box thinking also has to be right.


INTERVIEW BY VIRGINIA GEWIN 


This interview has been edited for length and clarity. 
ROBOT BURIAL


BY H. E. ROULO 


Sanders stared across the hillside, resting
a hand on his shovel. The
horizon wavered in the heat. 
He sighed. Maybe he shouldn't 
have answered when Richie 
texted, but they had been chums 
in school. Anyway, it was too 
late to back out now. Sweat 
trickled behind his ears and 
dripped onto the stiff collar of 
his white dress shirt. He swiped 
at the drops. 

The crunch of Richie’s shovel 
stilled and he looked up from 
digging. Hazy grey dust filled the 
brackets around his mouth. He 
had hardly spoken since Sanders 
had arrived at their old bonfire 
spot, except to point to the extra 
shovel leaning against the twisted 
pine tree. A tree that, by the way, 
cast almost no shade. 

He noticed Richie watching 
him with a scowl and frowned 
back. “What?” 

Richie licked his lips and finally spoke. 
“Are you making fun of me in that get-up? 
A suit?” 

“No, I just heard the word burial, and it 
was instinctive.” Sanders closed his eyes 
and took a deep breath. The whole thing 
certainly had the grimness of a funeral. He 
raised the wooden shaft of his shovel out of 
the hole they stood in and tapped the robot's 
padded frame, hearing the metal clang. “I’ve 
never buried a robot before.” 

Richie’s face turned red, like he’d finally
realized how he had sounded when he 
called and asked Sanders to meet him in 
the middle of nowhere for a rush burial. 
Who could blame Sanders for being rat- 
tled? What a relief to confirm the body 
really was a powered-off jumble of foam, 
plastic and metal. 

Richie jumped out of the hole and 
grabbed a sports bottle, taking a long pull of 
blue sugar-water. He snapped the bottle’s cap 
back into place and nearly set the bottle on 
the frame of the robot but fumbled, placing 
it on the ground instead. 

A moment to reflect.

“That’s a change from what we used to
drink out here.” Relieved they’d stopped
digging, Sanders leaned on his shovel,
wondering if Richie was ready to talk. Even
when they were in school together he’d kept
his thoughts private, and Sanders was
the one who filled the silences. Some things
never change. “You can still see bottles piled
at the base of the tree. Looks like kids don’t
come out here anymore, though.”

Richie examined the tree, perhaps think- 
ing of girls he had kissed under its branches, 
back when they were struggling students 
looking for a night away from it all. He 
mumbled: “Those were good times.” 

Not like today, seemed to go unsaid. 

“Maybe the suit was silly,” Sanders admitted.
“It’s not like I expected we’d meet at a
cemetery. What graveyard is going to allow 
a robot to be buried there? And it’s not like 
the robot was religious, am I right?” 

Richie chuckled. 

Encouraged, Sanders said: “She was a 
household robot, right? They’re different 
now, but I remember one like that from 
when I was little.”

“My folks were always early adopters. The 
robot was bought to assist my mother with 
chores even before her diagnosis.” 

“Gotcha. Old story. I suppose she ended 
up being a nanny to you? You spent more 
time with her than your parents?” 

Richie’s lips tight-
Folks always did for pets, why not for
robots? Especially if they talk?” Sanders
said. “I bet we end up with robot
cemeteries like they have pet cemeteries.
We should invest in one. We’d
make a quick buck for sure.”
Richie shook his head. “No, 
she wasn’t for me. As Mom’s 
muscles grew weaker, the robot 
could be trained to take over 
her burdens. Pretty soon, she 
was taking care of Mom, too. 
As Mom got older, the robot 
was the only one who knew 
exactly what she wanted. 

Maybe I should have been
there more, but I knew
she’d be taken care of.”

Uncertain of where 
this was going, Sanders 
leaned over to exam- 
ine the robot. “She's 
dinged up, but are you 
sure she’s dead?” 
Richie shook himself like a dog. 
“She’s a collection of Mom’s quirks, and 
now that Mom’s dead, she doesn't have a 
purpose. Too old to retrain, and too beat- 
up to sell. I called several dealers. They 
said they would buy the parts.” With his 
left hand, he reached out to the prone 
robot and folded flesh-toned padding into 
place. “But after everything she’s done, it 
would be like selling my sister for parts. 
She didn’t just do the things my parents 
didn’t want to do, she did what I didn’t 
want to do, too. Mom and I both got to 
keep nearly normal lives for a long time 
after she was disabled ... I completed my
degree because of her. 

“I kept her running for a while, but I don’t
want her to be the part of my mother that 
lasts the longest. It’s time.” 

Sanders was nodding. “So, we bury the 
robot.”

“Yeah,” Richie picked up his shovel. “But 
I’ll have to run back to my house for
something.”

Reluctantly, Sanders picked up his shovel. 
“What’s that?” 

“My good suit.”


H. E. Roulo lives in Seattle. She has 
released a dozen short stories, including 
‘Immeasurable’ in Nature previously. 
She recently released the first book in 
her Plague Master series. Find more at 
www.heroulo.com.